METHODS AND COMPOSITIONS 
FOR DETERMINING SPECIES OF BACTERIA AND FUNGI 



This application for patent under 35 U.S.C. 111(a) claims priority to Provisional 
Application Serial No. 60/053,097 filed July 25, 1997 under 35 U.S.C. 111(b). This 
invention was made with Government Support under Grant Number DK- R01-AI37728 
awarded by the National Institutes of Health. The government may have certain rights in the 
invention. 



FIELD OF THE INVENTION 

The present invention relates to the identification of species, and in particular, methods 
and compositions for distinguishing between bacterial and fungal species and determining the 
identity of bacterial and fungal pathogens in biological samples. 



BACKGROUND 

The detection and identification of microorganisms recovered from clinical specimens 
or environmental sources is an important aspect of clinical microbiology, as this information 
is important to physicians in making decisions related to methods of treatment. In order that 
a particular microorganism is identified correctly and consistently, regardless of the source or 
the laboratory identifying the organism, reproducible systems for identifying microorganisms 
are critical. As stated by Finegold, "The primary purpose of nomenclature of microorganisms 
is to permit us to know as exactly as possible what another clinician, microbiologist, 
epidemiologist, or author is referring to when describing an organism responsible for infection 
of an individual or outbreak" (S. Finegold, "Introduction to summary of current nomenclature, 
taxonomy, and classification of various microbial agents," Clin. Infect. Dis., 16:597 [1993]). 

Classification, nomenclature, and identification are three separate, but interrelated 
aspects of taxonomy. Classification is the arranging of organisms into taxonomic groups (i.e., 
taxa) on the basis of similarities or relationships. A multitude of prokaryotic organisms has 
been identified, with great diversity in their types, and many more organisms being 
characterized and classified on a regular basis. 

Classification has been used to organize the seemingly chaotic array of individual 
bacteria into an orderly framework. Through use of a classification framework, a new isolate 
can more easily be characterized by comparison with known organisms. The choice of 
criteria for placement into groups is currently somewhat arbitrary, although most 
classifications are based on phylogenetic relationships. An example of the arbitrariness of 
bacterial classification is reflected in the genetic definition of a "species" as being strains of 
bacteria that exhibit 70% DNA relatedness, with 5% or less divergence within related 
sequences (Baron et al., "Classification and identification of bacteria," in Manual of Clinical 
Microbiology, Murray et al. (eds.), ASM Press, Washington, D.C., pp. 249-264 [1995]). 

Generally, identification of a bacterium is based on its overall morphological and 
biochemical patterns observed in culture. Indeed, this is the primary technique employed 
today in clinical laboratories. Of course, this approach is flawed by the fact that diverse 
organisms can have similar morphologies and/or biochemical requirements. Moreover, 
numerous organisms associated with disease may not be cultured in vitro. Indeed, some do 
not grow well in traditional in vivo culture systems, such as cell cultures or embryonated 
eggs, nor in vitro such as various nutrient agars and broths. 

What is needed is a more defined system for speciation, and in particular, speciation 
of bacteria and fungi. Such an approach should be amenable to automation, permitting the 
approach to be used routinely in a clinical laboratory. 

SUMMARY OF THE INVENTION 

The present invention relates to the identification of microbial species, and in 
particular, methods and compositions for determining the species for an unknown bacterium 
(or fungus) in a sample. The methods and compositions of the present invention permit 
distinguishing between bacterial species (or between fungal species) and determining the 
identity of bacterial (or fungal) pathogens in biological samples. The present invention 
contemplates a method of speciation that does not require the sequencing of nucleic acid from 
biological samples. Instead, the method is based on detection of heretofore unknown 
uniquely conserved portions of ribosomal nucleic acid, such portions being conveniently 
revealed by restriction digestion of DNA encoding ribosomal nucleic acid, i.e. rRNA genes. 

In one embodiment of the method of the present invention for speciation, the present 
invention contemplates analysis of one or more so-called Ribosomal operons ("rrn") of a 
clinical isolate, the operon comprising three genes often arranged in the order 16S-23S-5S for 
prokaryotes (and 18S-5.8S-25S for eukaryotes), with "spacer" DNA separating each gene 
(hereinafter represented by: 5'-16S - spacer - 23S - spacer - 5S - 3'). The present invention 
contemplates that the analysis of at least one of these operons in an unknown bacterial or 
fungal species (when evaluated for the "signature band sets" of a particular species, the 
signature bands and methods for determining signature bands herein described) allows for 
accurate speciation. 

It is not intended that the present invention be limited by the technique by which the 
operons are analyzed. In one embodiment, primers directed to these sequences can be 
employed in an amplification reaction (such as PCR). On the other hand, these conserved 
sequences can conveniently be analyzed with restriction enzymes. Specifically, the present 
invention contemplates digesting bacterial or fungal DNA with one or more restriction 
enzymes which will produce a piece of nucleic acid which is within (or bounded by) the 5' 
and 3' ends of the operon. The resulting digestion product will be conserved for any given 
species and can serve as a "signature" for that particular species (other species having one or 
more signature bands of a different size). 

Specific embodiments of such a method include (but are not limited to) digestion with 
one or more restriction enzymes so as to produce any one of the following digestion products: 

The present invention also contemplates a host of variations on the above digestion products 
by cleaving in the middle of genes and/or in the middle of spacer regions. However, for the 
convenience of detecting such digestion products by gel electrophoresis, it is preferred that 
the digestion product (due to the relatively limited resolution level of gel electrophoresis) be 
at least 200 bp in size (and more preferably between 400 and 3000 bp in size). 

In one embodiment, the present invention contemplates digestion of such DNA with 
restriction enzymes that cut only once in the DNA encoding 16S ribosomal RNA and only 
once in the DNA encoding 23S ribosomal RNA. In a preferred embodiment, the present 
invention contemplates digestion of bacterial DNA using a single restriction enzyme which 
cuts only once in the DNA encoding 16S ribosomal RNA and only once in the DNA 
encoding 23S ribosomal RNA: 

In one embodiment, the present invention contemplates a method for bacterial 
speciation, comprising: i) isolation of bacterial DNA from a sample, said DNA comprising 
DNA encoding 16S and 23S rRNA; ii) digestion of said isolated DNA with one or more 
restriction enzymes under conditions such that restriction fragments are produced, said 
restriction fragments comprising a first digestion product of said DNA encoding 16S and 23S 
rRNA, said first digestion product comprising at least a portion of said DNA encoding 16S 
rRNA and at least a portion of said DNA encoding 23S rRNA; iii) separation of said 
restriction fragments (e.g. by gel electrophoresis); and iv) detection of said first digestion 
product. 

In another embodiment, the present invention contemplates a method for bacterial 
speciation, comprising: i) isolation of bacterial DNA from a sample, said DNA comprising 
DNA encoding 16S and 23S rRNA; ii) digestion of said isolated DNA with one or more 
restriction enzymes under conditions such that restriction fragments are produced, said 
restriction fragments comprising first and second digestion products (e.g. signature bands) of 
said DNA encoding 16S and 23S rRNA, said first digestion product being larger than said 
second digestion product, and comprising at least a portion of said DNA encoding 16S rRNA 
and at least a portion of said DNA encoding 23S rRNA; iii) separation of said restriction 
fragments (e.g. by gel electrophoresis); and iv) detection of said first and second digestion 
products. 

In yet another embodiment, the present invention contemplates a method for bacterial 
speciation, comprising: a) providing i) a first biological sample comprising bacterial DNA 
from a known bacterial species, and ii) a second biological sample comprising bacterial DNA 
from a bacterium whose species is unknown; b) isolating i) a first preparation of bacterial 
DNA from said first sample and ii) a second preparation of bacterial DNA from said second 
sample, said DNA of said first and second preparations comprising DNA encoding 16S and 
23S rRNA; c) digesting, in any order, i) said first preparation of isolated DNA with one or 
more restriction enzymes under conditions such that a first preparation of restriction 
fragments is produced, said first preparation of restriction fragments comprising a first 
digestion product, said first digestion product comprising at least a portion of said DNA 
encoding 16S rRNA and at least a portion of said DNA encoding 23S rRNA, and ii) said 
second preparation of isolated DNA with one or more restriction enzymes under conditions 
such that a second preparation of restriction fragments is produced, said second preparation 
of restriction fragments comprising a second digestion product, said second digestion product 
comprising at least a portion of said DNA encoding 16S rRNA and at least a portion of said 
DNA encoding 23S rRNA; d) separating, in any order, i) said restriction fragments (e.g. by 
gel electrophoresis) from said first preparation, and ii) said restriction fragments (e.g. by gel 
electrophoresis) from said second preparation; and e) comparing said first and second 
digestion products. 
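
The comparison of step e) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the claimed method: the function name `bands_match`, the 5% sizing tolerance, and all band sizes below are hypothetical values introduced here.

```python
def bands_match(known_bp, unknown_bp, tolerance=0.05):
    """Return True if every band in each lane has a similarly sized partner in the other.

    known_bp / unknown_bp: lists of digestion product sizes in base pairs.
    tolerance: assumed fractional sizing error of the gel (hypothetical value).
    """
    def has_partner(size, others):
        return any(abs(size - o) <= tolerance * max(size, o) for o in others)

    return (all(has_partner(s, unknown_bp) for s in known_bp)
            and all(has_partner(s, known_bp) for s in unknown_bp))


# Hypothetical digestion products (bp) for a known species and two unknown isolates.
known = [1500, 900]
isolate_a = [1480, 910]   # both bands fall within tolerance of the known bands
isolate_b = [1500, 650]   # second band differs in size

print(bands_match(known, isolate_a))
print(bands_match(known, isolate_b))
```

A symmetric check is used so that an extra or missing band in either lane counts as a mismatch.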

It is convenient to isolate bacterial DNA by lysis of bacteria to release DNA. It is also 
convenient to separate restriction fragments by gel electrophoresis, followed by transfer to a 
membrane for blotting with an oligonucleotide probe. 

It is not intended that the present invention be limited by the nature of the sample. 
The terms "sample" and "specimen" in the present specification and claims are used in their 
broadest sense. On the one hand they are meant to include a specimen or culture. On the 
other hand, they are meant to include both biological and environmental samples. These 
terms encompass all types of samples obtained from humans and other animals, including 
but not limited to, body fluids such as urine, blood, fecal matter, cerebrospinal fluid (CSF), 
semen, and saliva, cells as well as solid tissue (including both normal and diseased tissue). 
These terms also refer to swabs and other sampling devices which are commonly used to 
obtain samples for culture of microorganisms. In addition, fluids such as IV fluids, water 
supplies and the like are contemplated as samples. 

It is also not intended that the invention be limited by the particular purpose for 
carrying out the biological reactions. The present invention is applicable to medical testing, 
food testing, agricultural testing and environmental testing. In one medical diagnostic 
application, it may be desirable to simply detect the presence or absence of specific pathogens 
(or pathogenic variants) in a clinical sample. In yet another application, it may be desirable 
to distinguish one species or strain from another. 

With regard to distinguishing different species, in one embodiment, the present 
invention contemplates comparing two samples suspected to be different species. In another 
embodiment, a species that is suspected to have changed or diverged from the parent species 
is compared with the parent species. For example, a species or strain of bacteria may develop 
a different susceptibility to a drug (e.g. antibiotics) as compared to the parent species; rapid 
identification of the specific species or subspecies aids diagnosis and allows initiation of 
appropriate treatment. 

It is not intended that the present invention be limited by the means of detection or the 
means of comparing first and second digestion products. In one embodiment, said digestion 
products that are separated by gel electrophoresis are probed with a labeled oligonucleotide in 
a hybridization reaction. 

The present invention can be used with particular success when comparing samples. 
In one embodiment, the present invention contemplates a method of analyzing nucleic acid in 
biological samples, comprising: a) providing: i) first and second samples comprising bacterial 
nucleic acid, ii) a restriction enzyme capable of generating a restriction fragment within (or 
bounded by) the 5' and 3' ends of a bacterial Ribosomal operon; b) treating said nucleic acid 
of each of said two samples under conditions so as to produce restriction fragments; c) 
separating said restriction fragments; and d) comparing said restriction fragments from said 
first and second samples. 

It is not intended that the present invention be limited by the number or nature of 
samples compared. Clinical, food, agricultural, and environmental samples are specifically 
contemplated within the scope of the present invention. 

The present invention contemplates using restriction enzymes wherein the 
corresponding restriction enzyme recognition sequence exists only once in the 16S and 23S 
nucleic acid. Alternatively, restriction enzymes can be selected based on the known nucleic 
acid sequences (see e.g. Figures 4 and 7). 

DEFINITIONS 

To facilitate understanding of the invention, a number of terms are defined below. 

"Nucleic acid sequence" and "nucleotide sequence" as used herein refer to an 
oligonucleotide or polynucleotide, and fragments or portions thereof, and to DNA or RNA of 
genomic or synthetic origin which may be single- or double-stranded, and represent the sense 
or antisense strand. 

Prokaryotic ribosomes are constructed from 50S and 30S subunits that join together to 
form a 70S ribosome. The large subunit comprises a single "23S rRNA" molecule and a "5S 
rRNA" molecule, while the small subunit comprises a single "16S rRNA" molecule. 

As used herein, the terms "complementary" or "complementarity" are used in reference 
to "polynucleotides" and "oligonucleotides" (which are interchangeable terms that refer to a 
sequence of nucleotides) related by the base-pairing rules. For example, the sequence 
"C-A-G-T" is complementary to the sequence "G-T-C-A." 

Complementarity can be "partial" or "total." "Partial" complementarity is where one 
or more nucleic acid bases is not matched according to the base pairing rules. "Total" or 
"complete" complementarity between nucleic acids is where each and every nucleic acid base 
is matched with another base under the base pairing rules. The degree of complementarity 
between nucleic acid strands has significant effects on the efficiency and strength of 
hybridization between nucleic acid strands. This is of particular importance in amplification 
reactions, as well as detection methods which depend upon binding between nucleic acids. 
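
The base-pairing rules and the distinction between partial and total complementarity can be illustrated with a short Python sketch. The helper names `complement` and `matched_fraction` are hypothetical, and the hyphen-separated notation simply mirrors the "C-A-G-T" example above; the second test sequence is invented for illustration.

```python
# Watson-Crick base-pairing rules for DNA.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq):
    """Return the base-by-base complement of a hyphen-separated sequence."""
    return "-".join(PAIRS[base] for base in seq.split("-"))


def matched_fraction(seq_a, seq_b):
    """Fraction of aligned positions that obey the base-pairing rules."""
    a, b = seq_a.split("-"), seq_b.split("-")
    matches = sum(PAIRS[x] == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


print(complement("C-A-G-T"))                    # G-T-C-A
print(matched_fraction("C-A-G-T", "G-T-C-A"))   # 1.0  (total complementarity)
print(matched_fraction("C-A-G-T", "G-T-C-C"))   # 0.75 (partial complementarity)
```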

Ribosomal RNA molecules are characterized by the presence of numerous sequences 
that can form complementary base pairs with sequences located elsewhere in the same 
molecule. Such interactions cause rRNA molecules to fold into three-dimensional 
configurations that exhibit localized double-stranded regions. 

As used herein, the term "gene" means the deoxyribonucleotide sequences comprising 
the coding region and including sequences located adjacent to the coding region on both the 
5' and 3' ends typically for a distance of about 1-3 kb on either end such that the gene 
corresponds to the length of the full-length mRNA. The sequences which are located 5' of 
the coding region and which are present on the mRNA are referred to as 5' non-translated 
sequences. The sequences which are located 3' or downstream of the coding region and 
which are present on the mRNA are referred to as 3' non-translated sequences. The term 
"gene" encompasses both cDNA and genomic forms of a gene. 

The chromosomal DNA of prokaryotic cells contains multiple copies of the genes 
coding for rRNAs. For example, the bacterium E. coli contains seven sets of rRNA genes. 
In the rRNA transcription unit of E. coli, the three genes are typically arranged in the order 
16S-23S-5S, with "spacer" DNA separating each gene (the spacer DNA separating 23S from 
16S typically comprises one or more tRNA genes in addition to unencoded DNA). 

The terms "homology" and "homologous" as used herein in reference to nucleotide 
sequences refer to a degree of complementarity with other nucleotide sequences. There may 
be partial homology or complete homology (i.e., identity). A nucleotide sequence which is 
partially complementary, i.e., "substantially homologous," to a nucleic acid sequence is one 
that at least partially inhibits a completely complementary sequence from hybridizing to a 
target nucleic acid sequence. The inhibition of hybridization of the completely 
complementary sequence to the target sequence may be examined using a hybridization assay 
(Southern or Northern blot, solution hybridization and the like) under conditions of low 
stringency. A substantially homologous sequence or probe will compete for and inhibit the 
binding (i.e., the hybridization) of a completely homologous sequence to a target sequence 
under conditions of low stringency. This is not to say that conditions of low stringency are 
such that non-specific binding is permitted; low stringency conditions require that the binding 
of two sequences to one another be a specific (i.e., selective) interaction. The absence of 
non-specific binding may be tested by the use of a second target sequence which lacks even a 
partial degree of complementarity (e.g., less than about 30% identity); in the absence of 
non-specific binding the probe will not hybridize to the second non-complementary target. 

Low stringency conditions comprise conditions equivalent to binding or hybridization 
at 42°C in a solution consisting of 5X SSPE (43.8 g/l NaCl, 6.9 g/l NaH2PO4·H2O and 1.85 
g/l EDTA, pH adjusted to 7.4 with NaOH), 0.1% SDS, 5X Denhardt's reagent [50X 
Denhardt's contains per 500 ml: 5 g Ficoll (Type 400, Pharmacia), 5 g BSA (Fraction V; 
Sigma)] and 100 µg/ml denatured salmon sperm DNA followed by washing in a solution 
comprising 5X SSPE, 0.1% SDS at 42°C when a probe of about 500 nucleotides in length is 
employed. 

Other equivalent conditions may be employed to comprise low stringency conditions; 
factors such as the length and nature (DNA, RNA, base composition) of the probe and nature 
of the target (DNA, RNA, base composition, present in solution or immobilized, etc.) and the 
concentration of the salts and other components (e.g., the presence or absence of formamide, 
dextran sulfate, polyethylene glycol), as well as components of the hybridization solution may 
be varied to generate conditions of low stringency hybridization different from, but equivalent 
to, the above listed conditions. In addition, conditions which promote hybridization under 
conditions of high stringency can be used (e.g., increasing the temperature of the 
hybridization and/or wash steps, the use of formamide in the hybridization solution, etc.). 

When used in reference to a double-stranded nucleic acid sequence such as a cDNA or 
genomic clone, the term "substantially homologous" refers to any probe which can hybridize 
to either or both strands of the double-stranded nucleic acid sequence under conditions of low 
stringency as described above. 

When used in reference to a single-stranded nucleic acid sequence, the term 
"substantially homologous" refers to any probe which can hybridize to (i.e., it is the 
complement of) the single-stranded nucleic acid sequence under conditions of low stringency 
as described above. 

As used herein, the term "hybridization" is used in reference to the pairing of 
complementary nucleic acids using any process by which a strand of nucleic acid joins with a 
complementary strand through base pairing to form a hybridization complex. Hybridization 
and the strength of hybridization (i.e., the strength of the association between the nucleic 
acids) is impacted by such factors as the degree of complementarity between the nucleic 
acids, stringency of the conditions involved, the Tm of the formed hybrid, and the G:C ratio 
within the nucleic acids. 

As used herein the term "hybridization complex" refers to a complex formed between 
two nucleic acid sequences by virtue of the formation of hydrogen bonds between 
complementary G and C bases and between complementary A and T bases; these hydrogen 
bonds may be further stabilized by base stacking interactions. The two complementary 
nucleic acid sequences hydrogen bond in an antiparallel configuration. A hybridization 
complex may be formed in solution (e.g., C0t or R0t analysis) or between one nucleic acid 
sequence present in solution and another nucleic acid sequence immobilized to a solid support 
[e.g., a nylon membrane or a nitrocellulose filter as employed in Southern and Northern 
blotting, dot blotting or a glass slide as employed in in situ hybridization, including FISH 
(fluorescent in situ hybridization)]. 

As used herein, the term "Tm" is used in reference to the "melting temperature." The 
melting temperature is the temperature at which a population of double-stranded nucleic acid 
molecules becomes half dissociated into single strands (the mid-point). The equation for 
calculating the Tm of nucleic acids is well known in the art. As indicated by standard 
references, a simple estimate of the Tm value may be calculated by the equation: Tm = 81.5 + 
0.41(% G + C), when a nucleic acid is in aqueous solution at 1 M NaCl [see e.g., Anderson 
and Young, Quantitative Filter Hybridization, in Nucleic Acid Hybridization (1985)]. Other 
references include more sophisticated computations which take structural as well as sequence 
characteristics into account for the calculation of Tm. 
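
The simple estimate above, Tm = 81.5 + 0.41(% G + C), is easy to evaluate directly. The Python sketch below applies that equation as written; the function name `estimate_tm` and the short example sequence are hypothetical and serve only to show the arithmetic, not to suggest the estimate is accurate for a sequence of that length.

```python
def estimate_tm(sequence):
    """Estimate Tm in degrees C as 81.5 + 0.41 * (%G+C), for 1 M NaCl."""
    seq = sequence.upper()
    gc_percent = 100.0 * sum(base in "GC" for base in seq) / len(seq)
    return 81.5 + 0.41 * gc_percent


probe = "ATGCGCATTA"  # hypothetical sequence with 4 of 10 bases G or C (40% G+C)
print(round(estimate_tm(probe), 1))  # 97.9
```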

As used herein the term "stringency" is used in reference to the conditions of 
temperature, ionic strength, and the presence of other compounds such as organic solvents, 
under which nucleic acid hybridizations are conducted. "Stringency" typically occurs in a 
range from about Tm - 5°C (5°C below the Tm of the probe) to about 20°C to 25°C below Tm. 
As will be understood by those of skill in the art, a stringent hybridization can be used to 
identify or detect identical polynucleotide sequences or to identify or detect similar or related 
polynucleotide sequences. 

As used herein, the term "amplifiable nucleic acid" is used in reference to nucleic 
acids which may be amplified by any amplification method. It is contemplated that 
"amplifiable nucleic acid" will usually comprise "sample template." 

As used herein, the term "sample template" refers to nucleic acid originating from a 
sample which is analyzed for the presence of a target sequence of interest. In contrast, 
"background template" is used in reference to nucleic acid other than sample template which 
may or may not be present in a sample. Background template is most often inadvertent. It 
may be the result of carryover, or it may be due to the presence of nucleic acid contaminants 
sought to be purified away from the sample. For example, nucleic acids from organisms 
other than those to be detected may be present as background in a test sample. 

"Amplification" is defined as the production of additional copies of a nucleic acid 
sequence and is generally carried out using polymerase chain reaction technologies well 
known in the art [Dieffenbach CW and GS Dveksler (1995) PCR Primer, a Laboratory 
Manual, Cold Spring Harbor Press, Plainview NY]. As used herein, the term "polymerase 
chain reaction" ("PCR") refers to the method of K.B. Mullis, U.S. Patent Nos. 4,683,195 and 
4,683,202, hereby incorporated by reference, which describe a method for increasing the 
concentration of a segment of a target sequence in a mixture of genomic DNA without 
cloning or purification. The length of the amplified segment of the desired target sequence is 
determined by the relative positions of two oligonucleotide primers with respect to each other, 
and therefore, this length is a controllable parameter. By virtue of the repeating aspect of the 
process, the method is referred to as the "polymerase chain reaction" (hereinafter "PCR"). 
Because the desired amplified segments of the target sequence become the predominant 
sequences (in terms of concentration) in the mixture, they are said to be "PCR amplified". 

With PCR, it is possible to amplify a single copy of a specific target sequence in 
genomic DNA to a level detectable by several different methodologies (e.g., hybridization 
with a labeled probe; incorporation of biotinylated primers followed by avidin-enzyme 
conjugate detection; incorporation of 32P-labeled deoxynucleotide triphosphates, such as dCTP 
or dATP, into the amplified segment). In addition to genomic DNA, any oligonucleotide 
sequence can be amplified with the appropriate set of primer molecules. In particular, the 
amplified segments created by the PCR process itself are, themselves, efficient templates for 
subsequent PCR amplifications. 
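
Two points from the preceding paragraphs lend themselves to a small worked example in Python: the amplified segment length follows from the primer positions, and repeated cycles multiply the copy number. The sketch assumes ideal doubling in every cycle, which is a simplification not stated in the text; the function names and coordinates are hypothetical.

```python
def amplicon_length(forward_start, reverse_end):
    """Length of the amplified segment, set by the outer primer positions.

    Positions are 1-based and inclusive on the template strand.
    """
    return reverse_end - forward_start + 1


def ideal_copies(initial_copies, cycles):
    """Copy number after `cycles` rounds, assuming perfect doubling each cycle."""
    return initial_copies * 2 ** cycles


print(amplicon_length(101, 600))   # 500
print(ideal_copies(1, 30))         # 1073741824
```

Under this idealized assumption a single template copy exceeds one billion copies after 30 cycles, which is consistent with a single copy becoming detectable.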

Amplification in PCR requires "PCR reagents" or "PCR materials", which herein are 
defined as all reagents necessary to carry out amplification except the polymerase, primers 
and template. PCR reagents normally include nucleic acid precursors (dCTP, dTTP etc.) and 
buffer. 

As used herein, the term "primer" refers to an oligonucleotide, whether occurring 
naturally as in a purified restriction digest or produced synthetically, which is capable of 
acting as a point of initiation of synthesis when placed under conditions in which synthesis of 
a primer extension product which is complementary to a nucleic acid strand is induced, (i.e., 
in the presence of nucleotides and an inducing agent such as DNA polymerase and at a 
suitable temperature and pH). The primer is preferably single stranded for maximum 
efficiency in amplification, but may alternatively be double stranded. If double stranded, the 
primer is first treated to separate its strands before being used to prepare extension products. 
Preferably, the primer is an oligodeoxyribonucleotide. The primer must be sufficiently long 
to prime the synthesis of extension products in the presence of the inducing agent. The exact 
lengths of the primers will depend on many factors, including temperature, source of primer 
and the use of the method. 

As used herein, the term "probe" refers to an oligonucleotide (i.e., a sequence of 
nucleotides), whether occurring naturally as in a purified restriction digest or produced 
synthetically, recombinantly or by PCR amplification, which is capable of hybridizing to 
another oligonucleotide of interest. A probe may be single-stranded or double-stranded. 
Probes are useful in the detection, identification and isolation of particular gene sequences. It 
is contemplated that any probe used in the present invention will be labelled with any 
"reporter molecule," so that it is detectable using any detection system, including, but not 
limited to enzyme (e.g., ELISA, as well as enzyme-based histochemical assays), fluorescent, 
radioactive, and luminescent systems. It is not intended that the present invention be limited 
to any particular detection system or label. 

As used herein, the terms "restriction endonucleases" and "restriction enzymes" refer to 
bacterial enzymes, each of which cuts double-stranded DNA at or near a specific nucleotide 
sequence. Such enzymes can be used to create Restriction Fragment Length Polymorphisms 
(RFLPs). RFLPs are, in essence, unique fingerprint snapshots of a piece of DNA, be it a 
whole chromosome (genome) or some part of this, such as the regions of the genome that 
specifically flank ribosomal operons. All such RFLP fingerprints are indicative of the random 
mutations in all DNA molecules that inevitably occur over evolutionary time. Because of this, 
if properly interpreted, evolutionary relatedness of any two genomes can be compared based 
on the fundamental assumption that all organisms have had a common ancestor. Thus, the 
greater the difference in RFLP fingerprint profiles, the greater the degree of evolutionary 
divergence between them (although there are exceptions). With such an understanding, it then 
becomes possible, using appropriate algorithms, to convert RFLP profiles of a group of 
organisms (e.g. bacterial isolates) into a phylogenetic (evolutionary) tree. 
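
The text does not name the algorithms used to convert RFLP profiles into a tree. As one hedged illustration, the Python sketch below computes a band-sharing (Dice) coefficient between profiles, a pairwise measure that could feed a tree-building step. The choice of the Dice coefficient, the function name, and all band sizes are assumptions introduced here.

```python
def dice_similarity(profile_a, profile_b):
    """Band-sharing (Dice) coefficient between two sets of band sizes (bp)."""
    a, b = set(profile_a), set(profile_b)
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical RFLP profiles, with band sizes already binned to common values.
profiles = {
    "isolate_1": [4000, 2500, 1200, 800],
    "isolate_2": [4000, 2500, 1200, 650],
    "isolate_3": [5200, 3100, 1200, 300],
}

# Pairwise distance = 1 - similarity; larger values suggest greater divergence.
names = sorted(profiles)
for i, x in enumerate(names):
    for y in names[i + 1:]:
        distance = 1 - dice_similarity(profiles[x], profiles[y])
        print(x, y, round(distance, 2))
```

Here isolates 1 and 2 share three of four bands (distance 0.25), while isolate 3 shares only one band with either (distance 0.75).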

RFLPs are generated by cutting ("restricting") a DNA molecule with a restriction 
endonuclease. Many hundreds of such enzymes have been isolated, as naturally made by 
bacteria. In essence, bacteria use such enzymes as a defensive system, to recognize and then 
cleave (restrict) any foreign DNA molecules which might enter the bacterial cell (e.g. a viral 
infection). Each of the many hundreds of different restriction enzymes has been found to cut 
(i.e. "cleave" or "restrict") DNA at a different sequence of the 4 basic nucleotides (A, T, G, 
C) that make up all DNA molecules, e.g. one enzyme might specifically and only recognize 
the sequence A-A-T-G-A-C, while another might specifically and only recognize the sequence 
G-T-A-C-T-A, etc. Dependent on the unique enzyme involved, such recognition 
sequences vary in length, from as few as 4 nucleotides (e.g. A-T-C-C) to as many as 21 
nucleotides (A-T-C-C-A-G-G-A-T-G-A-C-A-A-A-T-C-A-T-C-G). From here, the simplest 
way to consider the situation is that the larger the recognition sequence, the fewer restriction 
fragments will result, since the larger the recognition site, the lower the probability that it will 
repeatedly be found throughout the genomic DNA. 
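
Cutting a DNA molecule at each occurrence of a recognition sequence can be mimicked on a text string. The Python sketch below is illustrative only: the function `digest`, the toy sequence, and the cut position within the site are hypothetical, and the site A-A-T-G-A-C is simply the illustrative sequence from the paragraph above rather than a claim about any particular enzyme.

```python
def digest(sequence, site, cut_offset):
    """Cut a linear DNA string at every occurrence of `site`.

    cut_offset: number of bases into the site after which the cut falls
    (a hypothetical parameter for illustration).
    Returns the fragment lengths in order along the molecule.
    """
    fragments, start = [], 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(cut - start)
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(len(sequence) - start)
    return fragments


# Toy 26-base sequence containing the site AATGAC twice.
dna = "CCCC" + "AATGAC" + "GGGGGGGG" + "AATGAC" + "TT"
print(digest(dna, "AATGAC", 3))  # [7, 14, 5]
```

Two sites in a linear molecule give three fragments whose lengths sum to the original length.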

In one embodiment, the present invention utilizes the restriction enzyme called EcoRI
which has a 6 base pair (nucleotide) recognition site. Thus, given that there exist but 4
nucleotides (A,T,G,C), the probability that this unique 6 base recognition site will occur at
any given position is 1 in 4^6, or once in every 4,096 nucleotides. Given that the H. influenzae
("Hi") genome (chromosome) is approximately 2 x 10^6 bp (base pairs) in length, digestion of
this DNA with EcoRI theoretically should yield about 488 fragments. This varies significantly
from isolate to isolate of H. influenzae because of "random mutations" that inevitably occur
over evolutionary time, some of which either destroy an EcoRI cutting site, or create a new
one. As such, the overall degree of variation in EcoRI RFLP profiles among a series of
isolates within a given species such as H. influenzae, is indicative of the degree of genetic
relatedness of these isolates (although there are exceptions). Using appropriate algorithms,
such RFLP profiles are readily converted to "phylogenetic trees" (see e.g. Figure 3) which are
simply diagrammatic figures indicating the evolutionary divergence of isolates from some
theoretically common ancestor.
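
The arithmetic behind the estimate of roughly 488 EcoRI fragments can be checked with a
short sketch. It assumes, as the estimate itself does, that the four nucleotides are equally
frequent and independent along the chromosome; the function name is illustrative only.

```python
def expected_fragment_count(genome_length_bp: int, site_length_bp: int) -> float:
    """Expected number of restriction fragments for a recognition site of the
    given length, assuming equal and independent base frequencies."""
    # A specific site of length n is expected once every 4**n nucleotides.
    expected_spacing_bp = 4 ** site_length_bp
    return genome_length_bp / expected_spacing_bp


if __name__ == "__main__":
    print(4 ** 6)                                       # spacing for a 6-base site
    print(round(expected_fragment_count(2_000_000, 6)))  # approximate fragment count
```

Real genomes depart from this idealization (base composition is not uniform), so the figure
is best read as an order-of-magnitude expectation.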

Once the genomic (chromosomal) DNA of a bacterial isolate has been isolated, it is
then digested (cut) with an enzyme such as EcoRI. Following the digestion, the resultant
individual fragments are separated from one another based on their sizes. This can be done
by using agarose gel electrophoresis. In essence, during electrophoresis the smaller molecules
(DNA fragments) move faster than larger ones and thus the resultant separation is a gradient
from the largest to the smallest fragments. These can easily be visualized as bands down the
electrophoresis gel, from the top to the bottom with the smallest fragments bottom-most.
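
The digestion and size separation just described can be sketched in silico. The example
below is only an illustration on a short made-up sequence: it treats the DNA as linear
(a bacterial chromosome is circular), uses the EcoRI recognition sequence GAATTC with the
cut placed after the G, and lists fragments from largest to smallest, i.e. from the top of the
gel to the bottom. The function names are hypothetical.

```python
ECORI_SITE = "GAATTC"   # EcoRI recognition sequence
ECORI_CUT_OFFSET = 1    # EcoRI cuts between G and AATTC on the top strand


def digest_linear(sequence, site=ECORI_SITE, cut_offset=ECORI_CUT_OFFSET):
    """Return fragment lengths (bp) after cutting a linear sequence at every site."""
    sequence = sequence.upper()
    cut_positions = []
    start = sequence.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    boundaries = [0] + cut_positions + [len(sequence)]
    return [right - left for left, right in zip(boundaries, boundaries[1:])]


def gel_order(fragment_lengths):
    """Order fragments as they would appear down a gel: largest first."""
    return sorted(fragment_lengths, reverse=True)


if __name__ == "__main__":
    toy_sequence = "AAAA" + "GAATTC" + "C" * 10 + "GAATTC" + "TT"  # made-up sequence
    fragments = digest_linear(toy_sequence)
    print(fragments)
    print(gel_order(fragments))
```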

Using ribotyping methodology, DNA fragments involving the multiple (e.g. 6 for the
case of H. influenzae, 7 for the case of E. coli, etc.) ribosomal operons and the immediately
flanking DNA sequences (genes) can be distinguished by hybridization of the resultant
electrophoretically separated DNA fragments with a radioactively labeled ribosomal operon DNA
probe. This then reduces the total number of visualized DNA fragments (predicted above to
be approximately 488 restriction fragments) to those only including or immediately flanking
the rRNA operons, about 14 fragments in toto for H. influenzae. Nonetheless, because of
inevitable random background mutation indicative of evolutionary time, with the exception of
very recently evolved clones, every independent isolate of H. influenzae will have a variant
EcoRI ribotype RFLP profile. And the more variant the profiles, the more distantly related
will be any two isolates so compared. In contrast, rigorous conservation of 16S and 23S rRNA
sequences makes possible the unique species-specific RFLPs produced according to the
methods and compositions of the present invention.

DNA molecules are said to have "5' ends" and "3' ends" because mononucleotides are 
reacted to make oligonucleotides in a manner such that the 5' phosphate of one 
mononucleotide pentose ring is attached to the 3' oxygen of its neighbor in one direction via 
a phosphodiester linkage. Therefore, an end of an oligonucleotide is referred to as the "5' 
end" if its 5' phosphate is not linked to the 3' oxygen of a mononucleotide pentose ring. An 
end of an oligonucleotide is referred to as the "3' end" if its 3' oxygen is not linked to a 5' 
phosphate of another mononucleotide pentose ring. As used herein, a nucleic acid sequence, 
even if internal to a larger oligonucleotide, also may be said to have 5' and 3' ends. In either 
a linear or circular DNA molecule, discrete elements are referred to as being "upstream" or 5' 
of the "downstream" or 3' elements. This terminology reflects the fact that transcription 
proceeds in a 5' to 3' fashion along the DNA strand. 

As used herein, the term "an oligonucleotide having a nucleotide sequence encoding a
gene" means a nucleic acid sequence comprising the coding region of a gene, i.e. the nucleic
acid sequence which encodes a gene product. The coding region may be present in either a
cDNA, genomic DNA or RNA form. When present in a DNA form, the oligonucleotide may
be single-stranded (i.e., the sense strand) or double-stranded.

As used herein, the terms "nucleic acid molecule encoding," "DNA sequence
encoding," and "DNA encoding" refer to the order or sequence of deoxyribonucleotides along
a strand of deoxyribonucleic acid. The order of these deoxyribonucleotides determines the
order of amino acids along the polypeptide (protein) chain. The DNA sequence thus codes
for the amino acid sequence.

The term "Southern blot" refers to the analysis of DNA on agarose or acrylamide gels
to fractionate the DNA according to size, followed by transfer and immobilization of the
DNA from the gel to a solid support, such as nitrocellulose or a nylon membrane. The
immobilized DNA is then probed with a labeled oligo-deoxyribonucleotide probe or DNA
probe to detect DNA species complementary to the probe used. The DNA may be cleaved
with restriction enzymes prior to electrophoresis. Following electrophoresis, the DNA may be
partially depurinated and denatured prior to or during transfer to the solid support. Southern
blots are a standard tool of molecular biologists [J. Sambrook et al. (1989) Molecular
Cloning: A Laboratory Manual, Cold Spring Harbor Press, NY, pp 9.31-9.58].

The term "Northern blot" as used herein refers to the analysis of RNA by
electrophoresis of RNA on agarose gels to fractionate the RNA according to size followed by
transfer of the RNA from the gel to a solid support, such as nitrocellulose or a nylon
membrane. The immobilized RNA is then probed with a labeled oligo-deoxyribonucleotide
probe or DNA probe to detect RNA species complementary to the probe used. Northern blots
are a standard tool of molecular biologists [J. Sambrook et al. (1989) supra, pp 7.39-7.52].

The term "reverse Northern blot" as used herein refers to the analysis of DNA by
electrophoresis of DNA on agarose gels to fractionate the DNA on the basis of size followed
by transfer of the fractionated DNA from the gel to a solid support, such as nitrocellulose or
a nylon membrane. The immobilized DNA is then probed with a labeled oligo-ribonucleotide
probe or RNA probe to detect DNA species complementary to the ribo probe used.

The term "isolated" when used in relation to a nucleic acid, as in "an isolated
oligonucleotide" refers to a nucleic acid sequence that is identified and separated from at least
one contaminant nucleic acid with which it is ordinarily associated in its natural source.
Isolated nucleic acid is nucleic acid present in a form or setting that is different from that in
which it is found in nature. In contrast, non-isolated nucleic acids are nucleic acids such as
DNA and RNA which are found in the state they exist in nature.

As used herein, the term "purified" or "to purify" refers to the removal of undesired 

components from a sample. 

As used herein, the term "substantially purified" refers to molecules, either nucleic or 
amino acid sequences, that are removed from their natural environment, isolated or separated, 
and are at least 60% free, preferably 75% free, and most preferably 90% free from other 
components with which they are naturally associated. An "isolated polynucleotide" is 
therefore a substantially purified polynucleotide. 

The term "sample" as used herein is used in its broadest sense and includes
environmental and biological samples. Environmental samples include material from the
environment such as soil and water. Biological samples may be animal, including human,
fluid (e.g., blood, plasma and serum), solid (e.g., stool), tissue, liquid foods (e.g., milk), and
solid foods (e.g., vegetables).

The terms "bacteria" and "bacterium" refer to all prokaryotic organisms, including those
within all of the phyla in the Kingdom Procaryotae. It is intended that the term encompass
all microorganisms considered to be bacteria including Mycoplasma, Chlamydia, Actinomyces,
Streptomyces, and Rickettsia. All forms of bacteria are included within this definition
including cocci, bacilli, spirochetes, spheroplasts, protoplasts, etc. Also included within this
term are prokaryotic organisms which are gram negative or gram positive. "Gram negative"
and "gram positive" refer to staining patterns with the Gram-staining process which is well
known in the art [Finegold and Martin, Diagnostic Microbiology, 6th Ed. (1982), CV Mosby 
St. Louis, pp 13-15]. "Gram positive bacteria" are bacteria which retain the primary dye used 
in the Gram stain, causing the stained cells to appear dark blue to purple under the 
microscope. "Gram negative bacteria" do not retain the primary dye used in the Gram stain, 
but are stained by the counterstain. Thus, gram negative bacteria appear red. 

DESCRIPTION OF THE DRAWINGS 

Figure 1 schematically shows the 6 ribosomal operons of the genomically sequenced
H. influenzae strain Rd.

Figure 2 is an autoradiograph of EcoRI RFLPs of H. influenzae isolates from diverse
sources, including the genomically sequenced strain Rd.

Figure 3 is an EcoRI based phylogenic tree of a diverse collection of H. influenzae
isolates (type "a" through "f", and non-typeable) from variant clinical and environmental
sources and geographical locales, showing the signature bands of H. influenzae with this
restriction enzyme.

Figure 4 shows the DNA sequence and restriction map of the H. influenzae Rd rrnA
and Rd rrnB operons (16S-spacer-23S-spacer-5S), with restriction sites noted for enzymes
cutting 5 times or less (an alphabetical list of restriction enzymes that cut H. influenzae Rd
rrnA and Rd rrnB operons 5 times or less is set forth in Table 2), with positions of restriction
sites indicated. While the genome of Hi Rd contains 6 ribosomal operons, all are identical to
the sequences presented here for either rrnA or rrnB.

Figure 5 shows the 7 ribosomal operons of the genomically sequenced E. coli strain
MG1655.

Figure 6 is an autoradiograph of the EcoRI RFLPs of E. coli isolates from diverse
sources, including the genomically sequenced strain MG1655, showing the signature bands
for this species using this restriction enzyme.

Figure 7 shows the DNA sequence and restriction map of the E. coli MG1655 rrn
("a" through "h") operons (16S-spacer-23S-spacer-5S), with restriction sites noted for
enzymes cutting 5 times or less.

Figure 8 is an autoradiograph of RFLP data for B. cepacia, showing signature bands
for this species.



DESCRIPTION OF THE INVENTION 

The present invention relates to the identification of species, and in particular, methods
and compositions for determining the species for an unknown bacterium (or fungus) in a
sample. The methods and compositions of the present invention permit distinguishing
between bacterial species (or between fungal species) and determining the identity of bacterial
(or fungal) pathogens in biological samples. In one embodiment, the present invention
contemplates the use of restriction enzymes followed by probing with an oligonucleotide
capable of hybridizing to fragments comprising at least a portion of DNA encoding 16S
and/or 23S rRNA. In this manner, the present invention applies, in one embodiment, the
"discriminatory power" of the methodology of ribotyping to the speciation of microbes for the
first time. The potential use of ribotyping as a method for speciation has been completely
overlooked.

To date, ribotyping has been applied for the purpose of examining differences
WITHIN a species. Specifically, ribotyping has been employed for the purpose of
epidemiological 'typing' within a given species, where 'variability' of the ribotype RFLP
profiles of individual isolates, one clinical isolate versus another clinical isolate, was of
interest for intra-species discriminatory purposes (for example, to determine whether or not
bacterial isolates within a known species were from an epidemic cluster involving a single
clone spread among patients). As such, the conserved species-specific signature bands were
not recognized as relevant. Instead, the variable bands making up the ribotype profile have
been of interest for discriminatory epidemiological purposes and phylogenetic tree building.

The present invention, by contrast, generates a species-conserved set of RFLP bands,
unique for each species. While of no interest for intra-species discrimination, these
species-conserved sets represent precise markers appropriate for inter-species discriminatory
purposes (i.e. to determine per se the species of a given, unknown isolate - which is a most
needed assay in the clinical microbiology lab of a hospital). Since all bacterial species
examined by the inventor display a conserved set of species-specific signature RFLP bands,
unique for every species, ribosomal operon-based discrimination of these unique
species-specific bands represents the most practical means available for speciation of bacteria
(in that the method is less tedious and far more applicable - as compared to sequencing - to
the clinical microbiology setting).

It must be stressed that the polymorphisms currently exploited in conventional,
epidemiological and phylogenetic ribotyping are polymorphisms that are not directly related
to ribosomal operon sequences. Rather, because of the conservation of DNA encoding 16S
and 23S rRNA within any species, polymorphisms typically result from variation in closest
flanking sequences (that is to say, nucleic acid falling outside of the region defined by: 5'-
16S - spacer - 23S - spacer - 5S - 3'). This point can be readily illustrated with the strain Hi
Rd, because the complete chromosomal sequence of this strain is known. In this regard, it
can be seen from Figures 1 and 2 that it is possible to predict the precise size of the 12
different flank sequences generated by an EcoRI digestion (or the fragments generated with
any other restriction enzyme for that matter) of the 6 rrn operons of strain Rd. With such
knowledge of the RFLP profile of the sequenced Hi strain Rd, using molecular genetic
methods (such as hybridization), it is possible to precisely analyze any alterations from this
prototypic ribotype fingerprint as found among other Hi isolates.
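
The distinction between conserved internal fragments and variable flanking fragments can
be sketched as follows. This is a minimal illustration, not the analysis actually performed on
strain Rd: the coordinates are made-up, the map is treated as linear, and the function name
is hypothetical. Given the positions of restriction sites and the boundaries of one operon, it
separates probe-hybridizing fragments that lie entirely within the operon from those that
extend into flanking DNA.

```python
def probe_fragments(cut_sites, operon_start, operon_end):
    """Classify restriction fragments that overlap one operon.

    Returns (internal, flanking) lists of fragment lengths in bp, where an
    internal fragment has both ends inside the operon and a flanking
    fragment has one end in the neighbouring DNA.
    """
    sites = sorted(cut_sites)
    internal, flanking = [], []
    for left, right in zip(sites, sites[1:]):
        overlaps_operon = left < operon_end and right > operon_start
        if not overlaps_operon:
            continue  # a ribosomal operon probe would not detect this fragment
        if left >= operon_start and right <= operon_end:
            internal.append(right - left)
        else:
            flanking.append(right - left)
    return internal, flanking


if __name__ == "__main__":
    # Hypothetical map: one site in the 16S gene, one in the 23S gene,
    # and one site in the flanking DNA on each side of the operon.
    sites = [1000, 6000, 7500, 12000]
    print(probe_fragments(sites, operon_start=5000, operon_end=10000))
```

With one site in each of the 16S and 23S genes, each operon contributes one internal and
two flanking fragments, which is consistent with the 12 flanking fragments stated above for
the 6 rrn operons of strain Rd.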

From this example with Hi, it should be clear that the polymorphisms generated by the
conventional ribotyping technique have nothing directly to do with variability of ribosomal
operon sequences. Rather, these polymorphisms result from variations in the neutral genes
that are genetically-linked to (i.e. that flank) the multiple ribosomal operons encoded by all
bacterial chromosomes.

In contrast to conventional ribotyping, the present invention utilizes the ribosomal
operon sequences which vary less than 3% (and more preferably less than 2%) within a
species but vary between species. The description of the invention involves: I) Preparation
of Nucleic Acid from Samples; II) Selection of A Restriction Enzyme; III) Design of the
Probe; IV) Comparing Biological Samples; and V) Speciation In A Clinical Setting.

I. The Preparation of Nucleic Acid

A. DNA Preparation

The nucleic acid content of cells consists of deoxyribonucleic acid (DNA) and
ribonucleic acid (RNA). With respect to DNA preparation, a variety of preparation schemes
are possible. Typically, the steps involved in purification of nucleic acid from cells include 1)
cell lysis; 2) inactivation of cellular nucleases; and 3) separation of the desired nucleic acid
from the cellular debris and other nucleic acid. Cell lysis may be achieved through various
methods, including enzymatic, detergent or chaotropic agent treatment. Inactivation of
cellular nucleases may be achieved by the use of proteases and/or the use of strong denaturing
agents. Finally, separation of the desired nucleic acid can be achieved by extraction of the
nucleic acid with solvents (e.g. phenol or phenol-chloroform); this method partitions the
sample into an aqueous phase (which contains the nucleic acids) and an organic phase (which
contains other cellular components, including proteins).

B. RNA Preparation 

It is preferred that the present invention utilize DNA and restriction enzymes to
analyze bacterial and fungal ribosomal operon conserved sequences. On the other hand, such
conserved sequences may also be examined in the form of 16S, 23S and/or 5S rRNA. For
example, such rRNA may be used as template in a PCR reaction with primers (typically DNA
primers) capable of amplifying such rRNA.

It should be stressed, however, that the preparation of RNA is complicated by the
presence of ribonucleases that degrade RNA (e.g., T. Maniatis et al., Molecular Cloning, pp.
188-190, Cold Spring Harbor Laboratory [1982]). Furthermore, the preparation of
amplifiable RNA is made difficult by the presence of ribonucleoproteins in association with
RNA. (See, R. J. Slater, In: Techniques in Molecular Biology, J.M. Walker and W. Gaastra,
eds., Macmillan, NY, pp. 113-120 [1983]).

II. Selection Of A Restriction Enzyme 

As noted above, the present invention contemplates in one embodiment that conserved
sequences can conveniently be analyzed with restriction enzymes. Specifically, the present
invention contemplates digesting bacterial or fungal DNA with one or more restriction
enzymes which will produce a piece of nucleic acid which is within (or bounded by) the 5'
and 3' ends of the ribosomal operon. The resulting digestion product will be conserved for
any given species and can serve as a "signature" for that particular species (other species
having one or more signature bands of a different size).

A variety of restriction enzymes (and corresponding restriction sites) are contemplated.
Given the sequence of the ribosomal operon for any particular species, restriction enzymes
can be selected on the basis of primary structure of the DNA. However, in a preferred
embodiment, restriction enzymes are selected based on ultraconserved sequences within the
ribosomal operon; these sequences encode rRNA that takes part in the formation of
secondary structures and are known to be more highly conserved because they must fold on
themselves (forming secondary structures through Watson/Crick hydrogen bonding). Such
sequences encoding rRNA involved in secondary structures are known for some organisms
and can readily be determined from the primary structure of the ribosomal DNA for other
species using commercially available computer programs.

III. Design of The Probe

In the nucleic acid hybridization step of the method of the present invention, the test
DNA is denatured and exposed to denatured DNA of known sequence (i.e. "the probe") from
a particular organism. The amount of hybridization between the test DNA and known DNA
provides an indication of the degree of relatedness between the test and known organisms.
An important drawback to this approach is that hybridization between two single DNA
strands can occur even when 15% of the sequences are not complementary. Moreover, to
identify appropriate restriction fragments, one must be able to identify restriction fragments
that contain only very short regions (as short as 10 bases) of the 16S, 23S or 5S nucleic acid.

Regardless of these constraints, based on the knowledge of the specific ribosomal
operon DNA sequences of a particular species of bacteria which are recognized by a particular
restriction endonuclease ("RE"), the present invention contemplates a probe that can be
designed to ensure a specific reaction.

The most general ribosomal RNA probe substrate applicable is obtained from
purification of bulk ribosomal RNA (16S, 23S and 5S) molecules [See e.g. LiPuma et al., J.
Pediatrics 113:859 (1988)]. A more convenient approach is one using a cloned ribosomal
operon which is then digested from the cloning vector, separated by electrophoresis, removed
from the electrophoretic gel, and then used as probe substrate [see e.g. Arthur et al., Infection
& Immunity 58:471 (1990)].

The present invention contemplates a variety of methods for labeling probes, including
but not limited to isotopically labeling probes. In one embodiment, nick translation is
employed. Briefly, the DNA is lightly "nicked" (single-stranded breaks) with DNase, and a
DNA polymerase which can displace strands at nicks polymerizes DNA using the strand that
has not been displaced as template. The nucleoside triphosphates are tagged with isotopes (or
other detectable groups) and the polymerase introduces such markers into the nicked DNA.

In another embodiment, the probe is made by random priming. Briefly, the DNA is
denatured. Thereafter, small, random oligonucleotides, a labeled substrate, buffers and a
DNA polymerase which has no 3'-OH editing function are added. The random
oligonucleotides hybridize to places on the DNA and serve as primers for the synthesis of
new, labeled DNA.

In yet another embodiment, the probe is end labeled. Briefly, either a kinase attaches
a labeled phosphate to the 5'-OH of the DNA or a DNA polymerase with 3' editing function
is forced to depolymerize from the 3' end; the resulting single-stranded DNA is used as a
template to synthesize labeled DNA.

IV. Comparing Biological Samples 

The present invention contemplates, in one embodiment, using electrophoresis to
separate RFLP fragments for the comparison of the results between samples. Such an
approach can utilize control samples or control fragments to ensure the identification of
"signature bands" for a particular species. Moreover, it may be convenient to detect ONLY
the signature bands; this can be done by a variety of methods, including but not limited to the
isolating of the signature bands (i.e. free of other restriction fragments). Finally, it may be
desirable to automate the analysis.



A. Control Samples 

In one embodiment, the present invention contemplates a method wherein a sample of
a known bacterial or fungal species is treated in parallel with the test sample(s). In such an
approach, the known species is treated with the same restriction enzyme(s) and the resulting
fragments are placed in a control lane of the gel, permitting comparison of fragments between
the control samples and the test sample(s). Likewise the control may comprise other types or
combinations of DNA fragments of known size extracted and prepared for this purpose.



B. Control Fragments 

While treating a control sample in parallel is readily done, it may be more convenient 
to run pre-digested control bands along with the test sample(s). In such a case, the restriction 
fragments from the pre-digested known sample are simply added to a control lane at the time 
the test samples have been processed to make them ready for gel electrophoresis.

C. Detecting ONLY The Signature Bands 

It may be convenient to detect ONLY the signature bands when comparing samples. 
This can be done by a variety of methods, including but not limited to the isolating of the 
signature bands (i.e. free of other restriction fragments). In one embodiment, the present 
invention contemplates using electrophoresis in combination with a means for sizing the 
fragment (e.g. HPLC or Mass Spectrometry). In such an approach, restriction enzymes can 
be utilized that generate the smallest fragment so that this fragment (or fragments) will elute
from the bottom of the gel prior to the other fragments. The eluted fragment can immediately be
examined for size to confirm that the signature band is present or absent in the test sample. 

Similarly, the gel for gel electrophoresis can be prepared so as to permit the separation 
of only fragments in the size range of the signature bands. For example, larger bands capable 
of hybridizing to the probe would remain at the top of the gel (or be only poorly resolved near
the top of the gel). 

Also, PCR amplification based on primers including a known restriction site in the
conserved region followed by hybridization can be employed.

D. Automation

The present invention contemplates the automation of analysis. In this regard, the
present invention specifically contemplates the utilization of the Qualicon (a DuPont
subsidiary) "RiboPrinter System" - which is a fast automated apparatus that is (with some
modifications, including but not limited to, the provision of marker DNA comprising
signature bands) amenable to the automation of some of the above-described methods. In
operation, single colonies from 8 unknown microbes are inoculated directly into a sample
carrier into which a "DNA pre pack" is added that contains lysis buffer (enzymes to break
open bacteria, along with restriction endonucleases for cutting genomic DNA, along with
marker DNA molecules for comparative sizing of RFLP profiles). After initial heat
inactivation of colonies, followed by cell lysis and restriction of the DNA, the DNA is then
automatically extracted and restriction fragments separated according to size by gel
electrophoresis, and then transferred to a hybridization membrane. DNA is then automatically
hybridized to a labeled ribosomal operon probe, after which a chemiluminescent agent is
introduced. Emission of light from hybridized fragments is captured by a digitizing camera and
stored as image data. Using proprietary algorithms, a RiboPrint pattern for each sample is
extracted from the image data. This pattern can then be compared to other RiboPrint RFLP
profiles stored in the system. Such results can be generated every 8 hours, with analysis of
the next set of 8 samples begun 2 hours after the first.

The present invention also contemplates a new means for resolving species specific
ribosomal RNA bands. This involves hybridization in solution following restriction digestion
of the unknown chromosomal DNA sample after which unbound chemiluminescent probe is
removed and the sample is electrophoresed. At this point, based on the known rate of
migration of DNA fragments of variant size, a chemiluminescent detector is used to detect
when hybridized restriction fragments chemiluminescently labeled with the rrn probe elute
from the electrophoretic gel. Given the elution rate will be determined by speed of migration,
and that migration speed for a fragment of a given size is predictable, the time at which the
so chemiluminescently labeled hybridized fragment elutes will indicate its size and thus reveal
the signature bands indicative of one species or another.
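
The conversion from elution time to fragment size can be sketched as a calibration against
marker fragments of known size. The sketch below is one possible approach under stated
assumptions, not a procedure given in this disclosure: it assumes that the logarithm of
fragment size varies approximately linearly with elution time between neighbouring markers,
and the marker values and function name are hypothetical.

```python
import math


def estimate_size_bp(elution_time, markers):
    """Estimate fragment size from its elution time.

    markers: list of (elution_time, size_bp) pairs for fragments of known
    size, with distinct elution times. Interpolates log10(size) linearly
    between the two markers that bracket the observed elution time.
    """
    points = sorted(markers)
    if not points[0][0] <= elution_time <= points[-1][0]:
        raise ValueError("elution time outside the calibrated range")
    for (t0, size0), (t1, size1) in zip(points, points[1:]):
        if t0 <= elution_time <= t1:
            fraction = (elution_time - t0) / (t1 - t0)
            log_size = math.log10(size0) + fraction * (
                math.log10(size1) - math.log10(size0)
            )
            return 10 ** log_size


if __name__ == "__main__":
    # Hypothetical calibration: smaller fragments elute earlier.
    marker_run = [(10.0, 1000), (20.0, 10000)]
    print(round(estimate_size_bp(15.0, marker_run)))
```

An estimated size can then be compared against the expected signature band sizes for each
candidate species, within whatever tolerance the calibration supports.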


V. Speciation In A Clinical Setting 

The present invention specifically contemplates applying the above-described method
to medical diagnostic applications. For example, it may be desirable to simply detect the
presence or absence of specific pathogens (or pathogenic variants) in a clinical sample. In yet
another application, it may be desirable to distinguish one species from another. This is a
process carried out tens of thousands of times daily in clinical medical microbiology
laboratories in hospitals throughout the world, albeit without the benefit of the present
invention. Indeed, it is the most common diagnostic analysis (test) carried out in the hospital
clinical microbiology laboratory.

Identification of a particular species of microbe causing the infection of a particular
patient is needed in order to decide how to treat the infection, e.g. what type of antibiotic
should be used since different species (e.g. E. coli versus Pseudomonas aeruginosa versus
Haemophilus influenzae versus Burkholderia cepacia) exhibit different profiles of sensitivity
versus resistance to the same antibiotic. Likewise speciation may reveal whether there exists
a pathogen expressing tissue-damaging toxins.

Currently, speciation is most typically accomplished in the hospital clinical
microbiology lab using a combination of phenotypic assays involving: (i) a series of 10-15
biochemical tests (for nutrients required and substrates metabolized or catabolized by
microbes); (ii) growth on selective growth media; and (iii) others. At best, results from such
tests typically take 12 - 24 hours to obtain and sometimes as long as 5 days (by which time
many an infected patient has expired). Such tests decipher the species involved with
approximately 95% of clinical samples.

The present invention, as noted above, contemplates a non-sequencing approach to
speciation. This is because an approach involving sequencing (e.g. purification of DNA, PCR
amplification of the 16S gene of the ribosomal operon followed by DNA sequencing) is
complex, costly and labor intensive. A sequencing approach is likely to be unsuitable to the
hospital setting.

That is not to say, however, that sequencing is altogether inappropriate in all settings.
For example, when the 16S genes of different species are compared (e.g. E. coli versus H.
influenzae versus Neisseria meningitidis versus Streptococcus pneumoniae versus
Staphylococcus aureus, etc.), greater than 10% - 15% differences in the 16S genes are
revealed. Given such large differences, it is possible to precisely identify the species of
microbe in which the gene was found based on such sequencing of the 16S gene DNA.
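
The percent difference between two 16S gene sequences can be computed as sketched below.
The example assumes the two sequences have already been aligned to equal length (with "-"
marking gaps) by some separate alignment step; the toy sequences are made-up and the
function name is hypothetical.

```python
def percent_difference(aligned_a, aligned_b):
    """Percent of aligned positions at which two sequences differ.

    Both inputs must be pre-aligned to the same length; positions where
    both sequences carry a gap ('-') are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == "-" and base_b == "-":
            continue
        compared += 1
        if base_a != base_b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


if __name__ == "__main__":
    # Made-up 10-base fragments differing at two positions.
    print(percent_difference("ACGTACGTAC", "ACGTTCGTAA"))
```

Values of this kind can be read against the figures given in this description: less than about
3% variation within a species, and greater than 10% - 15% between the species compared
above.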

EXPERIMENTAL

The following examples serve to illustrate certain preferred embodiments and aspects 
of the present invention and are not to be construed as limiting the scope thereof. 

In the experimental disclosure which follows, the following abbreviations apply: eq
(equivalents); M (Molar); μM (micromolar); N (Normal); mol (moles); mmol (millimoles);
μmol (micromoles); nmol (nanomoles); gm (grams); mg (milligrams); μg (micrograms); L
(liters); ml (milliliters); μl (microliters); cm (centimeters); mm (millimeters); μm
(micrometers); nm (nanometers); °C (degrees Centigrade); Ci (Curies); EDTA
(ethylenediaminetetraacetic acid); PAGE (polyacrylamide gel electrophoresis); bp (base pair);
CPM (counts per minute).

The present invention is applicable to over 20 other species of bacteria.

To prepare bacterial DNA, cells were pelleted from 5 ml overnight culture, washed with
50:20 mM TE buffer [50 mM Tris (pH 8.0), 20 mM EDTA (pH 8.0)] and resuspended in 4
ml 50:2 mM TE buffer [50 mM Tris (pH 8.0), 2 mM EDTA (pH 8.0)]. Cells were first
incubated with 50 μl lysozyme solution (20 mg/ml) at 4°C for 30 minutes and then
incubated with 50 μl proteinase K (20 mg/ml) and 300 μl 10% SDS at 55°C for 5 hours. Then
1 ml of 10% lauroyl sarcosine (acid free) was added to the cell lysate, and the DNA was purified
by equilibrium centrifugation in a caesium chloride-ethidium bromide gradient.

Restriction fragment length polymorphism (RFLP) associated with multicopy
ribosomal operons was analysed using an rrnB probe. For Southern blot analysis, the DNA was
transferred from the gel to a nitrocellulose membrane using a Bio-Rad vacuum blotting apparatus.
The DNA hybridisation procedure was as follows. After Southern blotting, the membrane was baked at
80°C for 30 minutes, placed in a heat-sealable bag with 10-50 ml prehybridisation buffer,
heat-sealed and then incubated at 42°C for 5 minutes. Radio-labelled probe was prepared by
combining: 32 μl DNA (the DNA sample was a fragment cut from an LMP agarose gel, and was
boiled for 10 min. before use), 10 μl OLB, 2 μl BSA, 5 μl 32P, and 2 μl Klenow. Stock was
0.5 mCi in 50 μl (5 μl = 50 μCi). The mixture was incubated for ~5 hours or overnight in a
37°C H2O bath. Before the probe was added to the blotted nitrocellulose membrane, it was boiled
for 10 minutes. Tracking dye was added to the DNA probe before boiling. The labelled
probe was added to the membrane using a syringe. The bag was resealed and incubated at 42°C
for 4-24 hours on a shaker. The membrane was washed repeatedly but not allowed to dry.
Autoradiography was then carried out.

EXAMPLE 1

Conserved, species-specific signature bands: novel genetic markers for 
inter-species discrimination for H. influenzae 

Availability of the complete sequence of the chromosome of the Haemophilus
influenzae ("Hi") strain Rd allowed us to predict a priori the EcoRI RFLP profile
generated from the six known rrn (ribosomal) operons of this strain. As shown in Figure 1,
with EcoRI sites occurring once each in the species-conserved 16S and 23S rrn gene sequences
of each rrn operon, two possible internal fragments (16S-spacer-23S) are generated, depending
on the presence of 1 or 2 tRNA sequences within the spacer region between the 16S and 23S genes.
These two conserved EcoRI fragments (1,503 bp and ~1,748 bp) are found among all Hi
isolates.
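
The a priori prediction described above amounts to locating the EcoRI recognition sites in a known sequence and measuring the distance between them. The following Python sketch illustrates that idea under stated assumptions: the function names and the toy sequence are hypothetical and are not taken from the specification, and the sequence is a short placeholder rather than an actual rrn operon.

```python
def find_sites(sequence: str, site: str = "GAATTC") -> list[int]:
    """Return the 0-based start index of every occurrence of `site`."""
    sequence = sequence.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


def internal_fragments(sequence: str, site: str = "GAATTC") -> list[int]:
    """Lengths of the fragments lying between consecutive sites.

    EcoRI cuts at the same offset inside every GAATTC site, so the
    fragment length equals the distance between site start positions.
    """
    positions = find_sites(sequence, site)
    return [b - a for a, b in zip(positions, positions[1:])]


# Hypothetical toy "operon": filler, a site, a 20-base spacer, a site, filler.
toy_operon = "A" * 10 + "GAATTC" + "C" * 20 + "GAATTC" + "T" * 5

site_starts = find_sites(toy_operon)
fragment_lengths = internal_fragments(toy_operon)
print(site_starts)       # [10, 36]
print(fragment_lengths)  # [26]
```

With one site in the 16S gene and one in the 23S gene, a longer spacer (for example, one carrying a second tRNA sequence) simply yields a longer internal fragment, which is why two band sizes are expected.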

Among the >400 putative typable and "NT" (non-typable, i.e. unencapsulated) Hi
isolates (see Table 1) examined by EcoRI ribotyping (Figure 3), all serotype "a" through "e"
RFLP profiles and 253 of 311 NTHi (non-typable Hi) RFLP profiles contained both signature
bands. 53 NTHi RFLP profiles lacked both signature bands, whereas four lacked the 1748 bp
signature band and one lacked the 1503 bp signature band. All serotype "f" RFLP profiles
lacked both signature bands. These 58 NT and 8 serotype "f" isolates lacking EcoRI ribotype
signature bands appear not to be members of the species H. influenzae but appear to be a new
subspecies or species.
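
The isolate counts in the preceding paragraph can be cross-checked with simple arithmetic. The short Python sketch below only restates the figures given above; the variable names are illustrative.

```python
# NTHi RFLP profile counts as stated in the text.
nthi_profiles = {
    "both_bands_present": 253,
    "lacking_both_bands": 53,
    "lacking_1748_bp_band_only": 4,
    "lacking_1503_bp_band_only": 1,
}
serotype_f_lacking = 8  # serotype "f" isolates, all lacking both bands

total_nthi = sum(nthi_profiles.values())
lacking_any_band = total_nthi - nthi_profiles["both_bands_present"]

print(total_nthi)                             # 311
print(lacking_any_band)                       # 58
print(lacking_any_band + serotype_f_lacking)  # 66
```

The totals agree with the 311 NTHi profiles and the 58 NT isolates lacking one or more signature bands reported in the text.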

As described above, all 8 serotype "f" isolates plus 55 of 58 NTHi isolates lacking one
or more species-specific EcoRI signature bands appear clustered together in the Figure 3
dendrogram (the phylogenetic tree) as a clearly distinct lineage(s) from all of the other EcoRI
signature band-containing isolates, both serotype "a" through "e" and NT. The branches in the
Figure 3 dendrogram are representative of the respective serotypes as follows:

Type "a" is represented by branches 22-26. 

Type "b" is represented by branches 29-35. 

Type "c" is represented by branches 50-54.

Type "d" is represented by branches 22-28. 

Type "e" is represented by branches 58-64. 

Type "f" is represented by branches 73-88 (comprising a unique lineage).

This unique lineage was not revealed in previous phylogenetic analyses of H. influenzae
based on methods known in the art, such as multi-locus enzyme electrophoresis (MLEE).

Preliminary 16S rrn gene sequencing has confirmed that putative Hi isolates missing the
EcoRI ribotype species-specific signature band(s) appear to have been mistyped as Hi by
the clinical microbiology labs providing these isolates.



EXAMPLE 2 

Conserved, species-specific signature bands: novel genetic markers for 
inter-species discrimination for E. coli 



An analogous experiment to the H. influenzae Example 1 shown above is performed
with the species Escherichia coli. In this experiment, the computer analysis exemplified by
Example 1 for H. influenzae is utilized for the complete genomic sequence of the E. coli
isolate MG1655 [Blattner, F., Plunkett III, G., Bloch, C., Perna, N., Burland, V., Riley, M.
The complete genome sequence of Escherichia coli K-12. Science 277 (5331), 1453-1462,
1997]. Roughly 160 independently isolated E. coli strains from diverse geographical locales,
time periods and sources are analysed (representative data are shown in Figure 6). In this
case, the conserved EcoRI ribotype RFLP bands indicative of species E. coli were resolved to
be 2.2 Kb in size. The inventor performed the sequence analyses for all seven (7) ribosomal
operons (rrnA - rrnH) of the E. coli strain, looking for appropriately conserved restriction
endonuclease sites, preferably one each in the 16S and 23S rRNA genes. A single site for EcoRI
was found in the 16S region, and also a single EcoRI site was found in the 23S region
(Figure 5). Sizes of the signature bands of the ribosomal operons in bp are as follows:

2148 bp (rrnA);
2151 bp (rrnB);
2064 bp (rrnC);
2149 bp (rrnD);
2067 bp (rrnE);
2143 bp (rrnG);
4476 bp (rrnH).

Knowing the base pair numbers allowed for a priori prediction of the EcoRI ribotype 
RFLP profile of the genomically sequenced E. coli isolate MG1655. Also, this allowed for 
the prediction of the conserved, species-specific bands represented by the internal fragments
between the 16S and 23S EcoRI cut sites (Figure 7). 

Both the E. coli MG1655 strain and 168 other E. coli isolates were then tested to
determine the genetic diversity. What was found is variability in ribotype RFLPs, with the
exception of the two conserved EcoRI bands. These two conserved EcoRI bands make up the
EcoRI species-specific signature.

Among the 185 putative isolates for this study, some were missing the bands that 
otherwise always clustered around the 2.2 Kb marker (i.e. the 2,065.5 and 2,148 bp bands). 
The isolates were re-typed (re-speciated) by the clinical microbiology lab. In every case, those 
isolates missing the 2 EcoRI RFLP bands proved NOT to be E. coli. 
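
The relationship between the per-operon fragment sizes and the two bands near the 2.2 Kb marker can be illustrated by grouping fragments that are too close in size to separate on a gel. The Python sketch below is only an illustration: the 25 bp grouping threshold and the function name are assumptions, and the operon labels follow the rrnA - rrnH range given in the text.

```python
from statistics import mean


def group_comigrating(sizes_bp, max_gap_bp=25):
    """Group sorted fragment sizes whose neighbours differ by <= max_gap_bp.

    max_gap_bp is an assumed stand-in for gel resolution, not a value
    taken from the specification.
    """
    bands = []
    for size in sorted(sizes_bp):
        if bands and size - bands[-1][-1] <= max_gap_bp:
            bands[-1].append(size)  # co-migrates with the previous fragment
        else:
            bands.append([size])    # starts a new band
    return bands


# Internal EcoRI fragment sizes per operon, as listed in the text.
operon_fragments_bp = {
    "rrnA": 2148, "rrnB": 2151, "rrnC": 2064, "rrnD": 2149,
    "rrnE": 2067, "rrnG": 2143, "rrnH": 4476,
}

bands = group_comigrating(operon_fragments_bp.values())
band_means = [mean(band) for band in bands]
print(bands)       # [[2064, 2067], [2143, 2148, 2149, 2151], [4476]]
print(band_means)  # [2065.5, 2147.75, 4476]
```

Under this grouping, the first two mean sizes correspond to the 2,065.5 bp and approximately 2,148 bp bands referred to above.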

EXAMPLE 3 

Conserved, species-specific signature bands: novel genetic markers for intra-species
discrimination for B. cepacia

An analogous experiment to Examples 1 and 2 shown above is performed with the
species Burkholderia cepacia. In this case, the conserved EcoRI ribotype RFLP bands
indicative of species B. cepacia were resolved to be 4.2 and 2.6 Kb in size (Figure 8). As
with E. coli, whenever an EcoRI ribotype characterized isolate in this B. cepacia study was
found to be missing these RFLP bands, and was subsequently examined by the clinical
microbiology lab for speciation, it proved NOT to be in the B. cepacia species. One of these
mis-typed B. cepacia isolates is shown in lane 9 of Figure 8. It can be seen here that this
isolate is missing the predictable B. cepacia species-specific EcoRI ribotype bands at 4.2 and
2.6 Kb in size. This isolate proved to be another species, Xanthomonas maltophilia.

EXAMPLE 4 

Comparison Of Signature Bands 

In this example, the specific signature bands were compared across the species tested
in Examples 1, 2 and 3 above. When comparing the signature bands for E. coli versus B.
cepacia (see Figures 6 and 8) as well as those for Hi versus E. coli versus B. cepacia (see
Figure 3), it is clear that these "signature" bands can be used to distinguish one species from
another.
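
The comparison in Example 4 can be thought of as a lookup of observed band sizes against a table of species signatures. The Python sketch below is a hedged illustration only: the band sizes are those reported in Examples 1 to 3 (with the B. cepacia bands taken as 2600 and 4200 bp from the stated 2.6 and 4.2 Kb), while the 3% size tolerance, the function names, and the example ribotypes are assumptions.

```python
# Signature band sizes (bp) as reported in Examples 1 to 3.
SIGNATURE_BANDS_BP = {
    "H. influenzae": (1503, 1748),
    "E. coli": (2065.5, 2148),
    "B. cepacia": (2600, 4200),
}


def has_band(observed_bp, expected_bp, tolerance=0.03):
    """True if any observed band lies within `tolerance` of expected_bp."""
    return any(abs(o - expected_bp) <= tolerance * expected_bp
               for o in observed_bp)


def matching_species(observed_bp, tolerance=0.03):
    """Species whose signature bands are all present in the profile."""
    return [
        species
        for species, signature in SIGNATURE_BANDS_BP.items()
        if all(has_band(observed_bp, band, tolerance) for band in signature)
    ]


# Hypothetical ribotype profiles (band sizes in bp), for illustration only.
profile_with_hi_signature = [900, 1500, 1750, 5200]
profile_missing_one_band = [2600, 3100]

print(matching_species(profile_with_hi_signature))  # ['H. influenzae']
print(matching_species(profile_missing_one_band))   # []
```

An isolate whose profile matches no signature would, as in Examples 2 and 3, be a candidate for re-speciation by the clinical microbiology lab.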
Table A



List of restriction enzymes (alphabetical order), cutting H. influenzae Rd rrnA 5 times 
or less, with positions of restriction sites indicated. 







Enzyme   Freq   Position(s)








Aot 1 1 

C AC6T C 
C TCCA 6 
f 


j 


1 190 


















Aco 111 
TCCCCA 
ACCCGT 


1 : 


365 








Ace 1 

ct rat AC 

CA KH TC 
t 


3 


I2Q 1 1566 


5077 






Aec til 
4 

T CCCC k 
A GGCC T 
t 


2 


1297 3757 








Ac II 


1 


3992 








G CTAG C 
C CATC C 
t 












Ac* Ml ^ 


3 


1074 37B8 


4825 






CAGCTCMMNNNNN * 
GTCCAGJMMNMMN * 


DtMN 
MNM 

t : 










Acr ! 

CTCCRC 

CRCtrc 


5 


1377 2168 


3917 . 


4259 


5143 


Afo24R 1 


1 


3968 








GCCGCC 
CCGCCC 












Afi in 

A CRTC 1 
T CYRC A 
t 


3 


679 1223 


2644 






Aft IV 
ACT ACT 
TCATCA 


2 


652 2716 








*"i 

A CCCC T 
T GCCC A 
t 


1 


400? 








aim r 

i 

CGATCNNNN N 
CCTACNNNN N 

t 


3 


1533 1630 


4646 






AI«N 1 

CAC WW CTC 
CTC MNM CAC 
t 


2 : 


1048 0492 


















ADO 1 

TCCCCA 
ACCGCT 


1 : 


1346 








Aos Ml 
CCCCGG 
GCCGCC 


2 : 


522 4304 








C GGCC C 
C CCGC C 
t 


2 


927 2675 


















C TCGR G 
G RGCt C 
t 


5 


1376 2169 


3918 


4260 


5144 


Asel 

1 

AT TA A7 
TA AT TA 
1 


1 


1890 
Asp52 I

AACCTT 
TTCGAA 

AspSH I 
GCATCC 
CCTACG 

A»p78 I 
A6CCCT 
TCCGGA 

At* I 

CCATCG 
GCTACC 

AtuC I 
TCATCA 
ACTAGT 

Avo I 

C tCGR C 
C RGCY C 
t 

Avr I) 

C CTAG C 
G CATC C 
t 



ACMKMNGTATC 
TGNHKNCATRC 



Bel I 

TCC CCA 
ACC GOT 
t 

Ban I 

C*GYRC C 

C CRrc C 

t 



G RGCt C 
C YCGR G 
t 

Bovl 

I 

CAG CTG 
GTC GAC 
t 

Bbf741l l 
TCCGGA 
AGGCCT 

Bbr t 

4 

A ACCT T 
T TCGA A 
I 

Bbs t 

GAAGACHN NMNN 
CTTCTGNM MMKN 



Beg I 



77 2332 1508 
213 
412 

1406 2952 
1 1 

137B 2469 3918 4260 5144 



7B8 4630 

614 4656 

4245 

847 4755 5508 



927 1008 2675 



GC ANKNNNNTCGNMNNMNNN 
CGTNHNNNNACCNHKNNNKN 



1794 4024 

1296 3756 

78 2333 4600 

1548 3567 6278 



3121 



1072 



5153 
Enzyme

Beg r 


1 


1038 






NN NNMMMNMNNNGCAMNN : 
NN NMNNNNNMNNCCTKNN : 

t 








Bel 1 
\ 

T CATC A 
A CTAC T 
t 


1 


12 






Bco102 II 
GAACAC 
CTTCTC 


3 


1540 3559 


5270 




Bcol63 1 
CTRYAG 
CAYRTC 


2 


1621 4338 






8co35 1 
CTECAG 
GACCTC 


2 


1 167 298fl 






bcui 
\ 

AC TAG T 
TCATCA 

t 


2 


3363 3369 






Bf 1891 

4 

Y GGCC R 
R CCCG T 
f 


3 


BB9 4166 


4243 




Bfa 1 

C TRY A G 
C AYRT C 
t 


2 


1622 4339 














Bol 1 

4 

CCCM KNN KGGC 
CCGN HHH HCCG 
t 


2 


3151 5149 






61119 1 
GCTCTC 
CCACAG 


3 


JW77 1515 


5434 




Blp 1 


1 


1790 






GCTNAGC 
CCANTCC 










Bm142 1 

4 

RGC GCY 
YCG CGR 
t 


2 


1804 3906 






BaeTI 

TGATCA 
ACTAGT 


1 


11 






Bpt 1 

CAGHNNNNCTC 
CTCNNNNNCAC 

t 

Bp* 1 




1269 1280 


3187 3198 




2 


1189 3006 






C T CG ACMMKNNMHNMNNNMN 
CACC T CNNNNNNNNNNNXNN 








BpulO 1 


2 


I62B 3259 






CC TNA GC 
GG ANT CG 
t 










Bpu1268 1 

CCTNNKNNAGG 
GGANNNNNTCC 


2 


338 »443 






B«o I 

4 

GGTCTCN KNNN 
CCACACN NKNN 


3 


4QBQ 4522 


5429 
Bio XI


2 








^ACNNNMNCTCC 
TCMMHMMCACC 










f 












5 


875 682 


q012 


qi69 4566 


B»o0 1 ^ 










cc ry te 










CO YR CC 










t 










B»oA 1 


q 


680 1226 


2647 


2785 


4 

YAC CTR 










RTB CAT 




















B»oG 1 


q 


trtfil 155B 


3123 


1540 


CW&CVC 










CVCCVG 










BftdC I 


1 


2866 






CTTAAC 










CAATTG 










Bsofl 1 


1 : 


3518 
















CAATE CN 




















CTTAC t CN 










B»b 1 


2 > 


1067 2558 






CAACAC 










CTTGTC 










BbcJ 1 


3 : 


1 123 tqoe 


q799 




CCAMWWNNTGG 










G6TNMNNNNACC 










B..59 1 


l : 


1499 






CCTMACC 










CCAHTCG 










Basil 1 


3 


368 q585 


q978 




GCAATG 










CGYTAC 










t 










BseR 1 


3 


2565 4510 


q553 




CAGCAGNNHNNNNN 
C T CC TCNNNNNNNN 


NN 
NN 

t : 
1 : 








Beg 1 


q208 


























BshL 1 


2 


2388 28q9 






GATATC 










CTATAC 










BslHXA 1 


q 


1008 1563 


3128 


4545 


G WGCW C 










C WCGW G 










f 










BsmSl 


1 


6355 






CGTCTCH NHNN 










CCACAGN HMNN^ 










BtnC 1 


1 


13BB 






TGTACA 










ACATGT 










Bsmti 1 


2 


1801 3903 






RGCCCY 










YC6CCR 










BsoO 1 


2 


BBS 9185 






CGCCCC 










GCCGGC 










BsoJI 


» 


3968 






GCCGGC 










CGCCCG 










Bsptl7 I 




227 922 


1001 


2670 


GRCCYC 










CTCGRC 



Bap 120 I 

c cccc c 

C CCCC G 
t 

8spl9l 2 

C CATC C 
C 6TAC C 
t 

Bap21 I 3 

GACNNNNNNT CGMMKWWW 
CTGNNNNNNACCMNNNNNN 



ft«p24 r 



NNNNN MMMNMHNHGACNN 
KNNNN NNNNNNWNCTCNN 



B*p8 tl 
CTCAA6 
6ACTTC 

B«p87 I 
CACCTC 
CTGCAC 

BspC I 
CTGGAC 
CACCTG 



T CATC A 
A GTAC T 

8tpJCT5l 



ACCTCCNNNN NMNN 
TGCACGNNNN NMNN 



GCAATC NN 
CGTTAC NN 
t 

BsrE I 
CTCTTC 



BarFI 
4 

R CCGC Y 
r GGCC R 
t 

BtrC I 
\ 

T 6TAC A 
A CATG T 
t 

BtrV I 

GCATC 
CCTAC 

BsaS I 

C TCGT G 
G AGCA C 
t 

8»t 1 107 I 
I 

CTA TAC 
CAT ATG 
\ 

B«t29 t 
CCTNAGG 
GCANTCC 



2671 



2953 



3217 



1191 1508 



145B 1176 



677 



3995 
26M 



9787 



3BQ2 
1223 
3492 



1560 3824 1017 4809 

1521 2950 1230 1619 

376 1593 1970 

3 998 1992 

500 3969 1007 

1387 

1521 1835 1851 
1065 

1212 

3609 11B« 



B.t£^l 

C CTNAC C 
C CANTC^G 

B»1HPl^ 

CTT A AC 
CAA TTC 
t 

BstX I j 

CCAN MMMN HTGG 
CCTN MMHN MACC 
t 

B»tZ2 I 

^GACNNNNNCTC 

CTGKHXKKCAC 



B*u36 t 

CC TMA CG 
CC AMI CC 
t 

CfrtO I 

R CCCC T 

r cccc^r 

C(rt« I 
YCCCCR 
RCCGGT 

Cfr9l 

C CCGG G 
G CGCC^C 

CfrJ4l 

CCC CCG 
CCC CCC 
t 

Chu II 
CTYRAC 
CARYTC 



TTT AAA 
AAA TTT 
t 



CACMN MM KNGTC 
CTGNN MM MMCAG 
t 

Drd II 

GAACCA 
CTTGGT 

Oso VI 
GTKXAC 
CAKtlTC 

toe I 

Y GCCC R 
R CCCC Y 
t 

^ I ^ 

CTCTTCM NMM 



G GTNAC C 
C CAMTC G 
t 

Eel I 

TCCGCC 
ACGCGG 



Postttonfs) 



1500 



1131 1414 



1196 505 1 5062 



3611 HB6 



500 3969 4007 



B88 4165 4242 



1378 



1380 



2B66 



1965 2028 3344 4821 



A620 4888 

1668 2713 3088 

1239 1584 5075 

889 4168 4243 



ECU 1 
TACCTA 
ATCCAT 


1 


27 B 2 






EC IE I 
CCCCCC 

cccccc 


2 


922 


2670 




Eel 137 1 
CACCTC 
CTCCAC 


1 


tooi 






EclHK 1 . 


2 


1191 


5057 




CACKN N KVCTC 
CTCKH N KMC AG 
t 










Ecp24I | 

C RGCY C 
C TCGR C 
t 


4 


232 


927 


1006 










EcoSII 


3 


4484 


4522 


5429 


GCTCTCH NNNN 
CCAGAGN HMNN 

1 











EcoSO I 
GGYRCC 
CCHTCG 

Eco52 I 

C CCCC G 
C CCGG C 
! 

Eco57 ( 



CTCAACMI 
CACTTCMI 



Eco64l 2 
* 

G GYRC C 
C CRTC G 
t 

Eco72 I 2 
* 

CAC GTG 
CTC CAC 
t 

Eco82 I S 
GAATTC 
CTTAAC 

EcoBAI i 
I 

C TCGR G 
C RGCY C 
t 

EcoO I < 

7TANNNNMNNGTCT 
AATNNNNNNNCAGR 

t 

EcoO XXI 
I 

1CANNNNNNNRTTC 
ACTNNNNNMNYAAG 



EcoOR2 J 
I 

TCANMMXNMGTCG 
AGTNNNNNNCAGC 

t 

EcoE I 
I 

GAGNNHHNNNATGC 
C7CMNMNMMNTACG 



EcoJCR I 
* 

GAG CTC 
CTC GAG 
t 



eqe 4754 5507 



889 4166 



1560 3824 4017 4809 

847 4755 5508 

680 1226 2647 

671 2418 

1378 2469 3918 4260 5144 

118 1609 1896 3262 

1370 4242 5459 



205 

EeoN 1 

CCTMN M NNAC6 
GGANN N^HMTCC 

EcoOI09 I 

R6 CMC CY 
TC CNG GR 
t 

EcoPI5 1 



EcoR I 

C AATT C 

r ?TAA € 
t 

EcoR V 
• I 
GAT ATC 
CTA TAG 
t 

EcoR 124 I 

GAANNNNNNRTCG 
CTTNMMXNMYACC 



ECOR 12* 11 



CTTMNNNNNNTAGC 



EcoR02 

GAANNNNMNRTTC 
CTTNNNNNNYAAG 



EcoVIM 3 

A AGCT T 
T TCGA A 
t 

Ecoprr 1 

CCAHMNNMNXRTGC 
GGTNNNNNMNYACC 

T 

Espie i 

CGTCTC 
CCACAG 

Esp3 I 

CCTCTCN NNNN 
GCAGAGN NNNN 



Fbll 

CT roc AC 
CA KR TC 

I 

TGC CCA 
ACC CCT 
t 

GACNNNGTC 
CTCNNNCAG 

Cdi II 

4 

CCGCC R 
GCCGG t 

t 



Op 



CACCTC 
GTCGAC 



343 1H8 



923 267* 



672 2419 



2371 2652 



1326 4272 



1350 3623 4135 4195 



3542 5288 



78 2333 «600 



5360 



5355 



1241 «586 5077 



323 »12B 4484 

893 894 4170 4171 

1791 4021 
Site Summary by Enzyme



wcc ccw 

VCC CCV 

t 



R CCCC T 
Y CCCC R 
t 



C AATT C 
C TTAA C 
t 



CACCCNWOW KWMH 
CTCCCNNNNN NKNNN 



Hoi CI 
I 

G CYRC C 
C CRYC C 
1 

HglE tl 

ACCKMNNMXCCT 
TCCNNMNNNCCA 

Htn8 I 
CRCCYC 
CYCCRC 

HinJCI 

I 

CTY RAC 
CAR YTC 
t 

HIAC II 

CTY RAC 
CAR YTC 
t 

Hind 111 

4 

A ACCT T 
T TCGA A 
t 

HP.., 

CTT AAC 
CAA TTC 
t 

Hsp92 I 

CR CG YC 
CY CC RG 
t 

I tp I 270 I 
RCATGY 
YGTACR 



CATATC 
C TAT AC 

niuiioe i 

RCCVCCT 
YCCVC6R 

ntuua i 

CC GC CG 
GG CG CC 
t 

rue i 

I 

TGG CCA 
ACC CGT 
t 



* 

CAYNN MKRTC 
GTRHN NMYAC 
t 



415 3047 4245 



1806 3908 



672 2119 



759 4411 4944 



B47 4755 5508 



3110 3871 



1185 4401 



78 2333 4600 



1 187 4403 

50 213 942 

2388 2649 

3261 4794 

524 4306 



1412 4069 4239 »76» 
Msp20 t 


2 


4242 4248 








TGGCCA 
ACCGCT 

f 












HspAl 1 


5 


525 1794 


1829 


4024 


4307 


CMC CKG 
CKC CMC 
t 

Hoe 1 


1 


397 1 








* 

GCC GGC 
CCG CCG 
t 












NCO 1 


2 


1407 2953 








C CATC G 
G GTAC^C 












C CCGC C 
C CGCC^G 


1 


3969 








Nhe 1 

G CTAG C 
C CATC G 
t 


) : 


3988 








Nl 1387/7 1 
1 

C YC6R G 
G RGCY C 
t 


5 


1382 2473 


3922 


4264 


5148 


Nru 1 

4 

TCC CCA 

acc ccr 
t 


1 : 
3 


1349 

55 2 IB 


947 






Hi p 1 ^ 

R CATC Y 
Y CTAC R 
t 












pfinoa i 

TCCTAG 
AGCATC 


2 


3173 6080 








PinAI 

A CCGG T 
T GCCC A 
t 

Ppel 


1 

2 


4007 
927 2675 








t 

G GCCC C 
C CCGG G 
t 












Ppu 1 253 1 
GACGTC 
CTGCAG 




1185 








Ppu6 1 


4 


677 1223 


2644 


2782 




VACCTR 
RTGCAY 












Pputl 1 


2 


3263 4796 








RG CWC CY 
YC CWC GR 
t 












PihA 1 

GACNN NNGTC 
CTGNN NKCAC 
t 


1 


4406 








Pip 1406 1 

AA^CC TT 
TT CC AA 
t 


2 


35B2 4445 

PtpAl 

C CCGC C 

c gccc c 

t 


1 


1378 










Pas 1 

4 

RC CMC CT 
TC CNG CR 
t 


4 


026 2670 


3266 4709 


Pvu II 

CAG CTC 
GTC GAC 
t 


2 


179* 402* 










Rhc ! 

TCATCA 
AGTACT 


1 


1473 




RleA 1 


2 

4 i 

NMM 
NNM 

T 

1 


3670 54 72 




CCCACANNlWNliHNN 
GCCTCTNNWHHWm* 






Soc 1 

4 

C A6CT C 
C TCGA C 
t 


1006 










Soc M 

* 

CC CC CG 
GG CG CC 
t 


2 


526 1308 




Sop. 4 

CCTCTTCM MHN 
CGAGAAGM NNH 

♦ 


1 


094 










SouLPI 

CCC GCC 
CCC CCC 
t 


1 


3071 










Sco 1 

4 

ACT ACT 
TCA TGA 
t 


2 


655 2721 


• 


Sfc 1 

C TRTA G 
C AYRT C 
t 


2 


1B22 4339 










SorA 1 
4 

CR CCG6 YC 
GY 6GCC RC 
t 


1 


3969 










Sw 1 

4 

CCC CCC 
CCC CCC 

t 


1 : 


1360 




Sail 

1 

C TTRA g 
C ARTT C 
t 


3 


3100 3908 


5168 


Sno 1 

GTATAC 
CATATC 


1 


1239 




SnoB 1 

4 

TAC GTA 
ATC CAT 
t 


1 


2765 




Spe 1 

A CTAC T 
T CATC A 
t 


1 


3364 
Sph I 1

C CATG C 
C 6TAC C 
t 


r 

218 
























S>ol 2 

C AATT C 
C TTAA C 
t 


672 


2419 






















Sap 1 ^ 3 

AAT ATT 
TTA TAA 

t 


363 


1928 


3654 




















Sill I 
G AGCT C 

C^TCGA C | 


1006 
























i 

Stu 1 1 

4 

AGC CCT 
TCC GG* 
■ t 


QI5 












StySJ 1 


1QI2 












GACKNKMNKCTRC 
CTCKNHNHNCATG 

t 














StySKI 1 

C6ATNNNNNNNCTTA 
CCTANNNNNNNCAAT 

t : 


873 
























StySP 1 1 j 

AAC MMNNNNCTRC 
TTf uiMnnni p a tc 

t ; 


2155 
























Syr II 3 
6AANNNNTTC 
C7TNNMNAAG 


410 


1368 


3430 




















Toq II 3 | ' 
GACCGANNNNNMHMN NN 

t 


2720 


3007 


4856 




















Toq II » j : 

CACCCANNNNNNNNN NN 

PTyuutniMuuii ttu 
GTCGCTNHPfWlffnN" nf» 


5088 
























mm i 3 

CACN N NGTC 
CTCN N NCAC 
t 


327 


1132 


4488 




















TtMII II 3 : 

CAARCANNNNMNNNN NN. 
GT T T G T NNNMWf "WW ^ Hit. 


109 


2647 


4740 




















Ubol220 1 t 

CCCCGC : 
CGCCCC 


1377 












Ubo122l 1 2 
CCTNACC 

CCANTCC : 

1 


1788 


1795 






















Ubol303 1 5 
CGRYCC 

CCtRCC : 


871 


888 


4008 


4165 


4562 




0bol326 1 4 
RCCNCCY 
TCCNCGR 


921 


2669 


3261 


4794 






Ubot382 I 1 
GAATCC 
CTTACC 


3511 




»em9l 1 

CCAM MUM MTCC 
CCTN MKM MACC 
t 


1 


2612 






HP, 

AT TA AT 
TA AT TA 
t 


1 


1890 
















1 


1164 






CCAMMMN N MMNNTCC 
eCTMKNM M MMMMACC 
t 








Xma 1 

C CCCC G 
C CCCC c 

t 


I 


1378 














X BO III 

* 

C CCCC c 
C CCCC c 

t 


2 


889 


QI68 




CAANM HMTTC 
CTTKM MNAAG 
t 


3 


415 


1373 


3435 
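
A listing such as Table A can be produced by counting the occurrences of each enzyme's recognition sequence in the operon sequence and keeping the enzymes that cut a small number of times. The Python sketch below shows one way this might be done. It is a simplified illustration under several assumptions: the function names, the toy sequence, and the enzyme subset are hypothetical; positions are reported as the 1-based start of the recognition sequence rather than the cut position; only the given strand is searched (sufficient for palindromic sites); and enzymes with no site are omitted.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "M": "[AC]", "K": "[GT]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def site_positions(sequence, recognition_site):
    """1-based start positions of every (possibly overlapping) match."""
    pattern = "".join(IUPAC[base] for base in recognition_site.upper())
    # A lookahead matches at each start position without consuming it,
    # so overlapping occurrences are all reported.
    return [m.start() + 1
            for m in re.finditer(f"(?={pattern})", sequence.upper())]


def rare_cutters(sequence, enzymes, max_sites=5):
    """Enzymes (alphabetical) with between 1 and max_sites sites."""
    table = {}
    for name in sorted(enzymes):
        positions = site_positions(sequence, enzymes[name])
        if 1 <= len(positions) <= max_sites:
            table[name] = positions
    return table


# Hypothetical toy sequence and a small enzyme subset, for illustration.
toy_sequence = "AAGAATTCTTGGTACCAAGAATTCTTGGCC"
enzymes = {
    "EcoR I": "GAATTC",
    "Kpn I": "GGTACC",
    "Ban I": "GGYRCC",
    "Not I": "GCGGCCGC",
}

site_table = rare_cutters(toy_sequence, enzymes)
for name, positions in site_table.items():
    print(name, len(positions), positions)
```

In this toy case EcoR I is reported with two sites, Kpn I and Ban I with one site each, and Not I is omitted because its recognition sequence does not occur.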
List of restriction enzymes (alphabetical order), cutting H. influenzae Rd rrnB 5 times
or less, with positions of restriction sites indicated.

Aat II

C ACCT C 
C TCCA C 
t 

Act) I 

TTCGAA 
AAGCT1 

Aeo til 
TCCCCA 
ACCCCT 



CT UK AC 
CA KM TC 

t 

ACC lit 

T cccc A 

A GGCC T 
t 

Aca II 

C CTA6 C 
C CATC C 
t 

Ace 111 



CACCTCNNNNHMN NNNN 

CTC cagnmxnwmn nmmn 



CYCCRC 
CRCCVC 



R AATT t 
Y TTAA^R 

MoMR 1 
CCCCCC 

cggccc 

Aft III 

A CRT6 T 
T CtRC^A 

AM IV 
ACTACT 
TCATCA 

kge I 

A CCCC T 
T CCCC A 
t 



CCATCNNNW H 
CCTACRHWI H 



CAG HUM CT6 
CTC WW CAC 
t 



TC6CCA 
AGCCCT 

IB III 

CCGCGG 
CCCCCC 



C GGCC C 
C CCGG G 



1 169 



1721 



1585 «B31 



1296 3511 



1073 3512 11579 

1376 2222 3671 «0I3 1B97 

671 2173 3185 3951 41 15 

3722 

678 1222 2398 

651 2472 
3761 

1532 "00 

1045 4246 

1315 

521 1058 

926 2429 

Apo I 

R AATT Y 
Y TTAA R 
1 


S 


B7I 2173 


3185 


3951 4115 










C TCCR G 
C RCCY C 
t 


5 


1377 2223 


3672 


4014 4888 


Asp52 I 
AAGCT7 
TTCCAA 


3 


76 2086 


4353 




AspSH 1 
CCATGC 
CGTAC6 


1 


212 






Atp7B 1 
AGCCCT 
TCCGGA 


2 


411 1683 






Ate 1 

CCATGG 
GOT AC C 


2 


1 106 2706 






AluC 1 
TGATCA 
ACTAGT 


1 


10 






Avo 1 

C*YCCR C 
C RCCY C 
t 


5 


1377 2223 


3872 


4014 4888 










Avr II 

C CTAC C 
C CATC C 
t 


2 


620 I6B7 






Bo* 1 

ACNNNNGTAYC 
TGHNNKCATRC 


2 


767 4384 






Boe 1 


2 


613 4410 






KNNNNNMNNNMNNNNACNNN : 








Bol i 

4 

TCG CCA 
ACC CCT 
t 


1 


3999 






Bon | 

G GYRC C 
C CRYG G 
t 


3 


646 4509 


5262 




Bon II 

G RCCY C 
C YCCT G 
t 


4 


231 926 


1005 


2429 


Bovl 

i 

CAC CTG 
GTC GAC 
t 


1 


377B 




- 


Bbf74M 1 
TCCGGA 
ACCCCT 


2 


1295 3510 






Bbr 1 

J 

A AGO T 
T TCGA A 
t 


3 


77 2087 


4354 




Bbs 1 

* 

CAAGACMN HNNN 
CTTCTCNN NMNN 


3 

t 


1597 3321 


5032 




Bco 1 

cccc 

CGCG 


5 


365 1 100 


3134 


3858 3720 

BceB3 



CTTCfcCNNWWMNNHHNMMN 

caIctchhnnwmhhnmmkmn 



Beg I 



CC AMNMNNNTC CHNMMKNHN 
CCTNNMWNNACCMMKNKMHH 



Beg r 



t 

Bet, 

7 CMC A 

Bcol02 II 
CAACAC 
CTTCTC 

Bcol63 I 
CTRYAG 
CATRTC 

Bco35 t 
CTCCAC 
CACCTC 



ACTACT 
TCATCA 

BflBBI 

Y^CCCC R 
R CCCC^T 

B<> ! 

c*trya e 

C AYRT^C 
Bgi 1 

6CCN MMN KCCC 
CCCM^NMN NCC6 

Bl 149 I 
GCTCTC 
CCA6AG 

Bm112 I 

RCC CCY 

ycb^ccr 

BmTI 

TCATCA 
ACTACT 

Bp I 1 

CACHHNNNCTC 
CTCHNKMHCAG 



1 

A] 
CTNMN 

1 



t 



CTCCACNNNMWMNMHMHWNN 
EaCCTCNNJWMHHHNMMNNN 



BpulO^I 

CC TMA CC 
CC AHT^CC 

Bpol26fi I 

CCTNNMNMACC 
GGANNNKNTCC 



Sw Summary by Enryme 

PosMonM 

2875 3883 1907 



1071 



II 

1539 3313 502* 
A092 

M68 2738 

3117 3123 

888 3920 3997 



2905 1903 



0231 1269 5188 



1288 



1279 2911 2952 



t IBB 2760 



3013 



337 1112 
(1 > 5273) Site Summary by Enzyme



-' » 

CCTCTCH WOW 
CCACAGM MNN^ 

6»o^II 

ACNNNWNCTCC 
TGMKMHNCACC 

t 

BioO I ^ 

C6 RT CC 
CC TR GC 
t 

BsoA I 

YAC CTR 
RTC CAt 
t 

BtoC I 

GMCCVC 
CVCGVG 

Bs<* 1 
CTTAAC 
CAATTC 

Bsofl t 

GAATC CN 
CTTAC CN 
t 

Bsb I 

CAACAC 
67TCTC 

BscJ I 

CCAN1MMMNTGC 
CCTNHNNNNACC 

Bs«59 I 
CCTNACC 
CCAMTCC 

Bseflt 

CCAATG 
CGTTAC 

t 

BseR I 



H22S 4278 5IB3 

2 2316 3563 

5 874 891 3786 3923 4320 

q 679 1225 2401 2539 

q 1000 1557 2877 4294 

1 2620 

1 3272 

2 1066 2312 

3 M22 1«»05 4553 

2 1198 1710 

3 367 1339 4730 

3 2319 1261 1307 



CTCCTCMMMXHHMN^MN 



CTGCAGNMMNK* 
CACCTCNMJWM* 



BshL I 
CATATC 
CTATAC 

BslKKA I 

G VCCV C 
C VCGV G 
t 

BsbBI 

CGTCTCN NNNN 
GCACAGH HHHU^ 

BsmC I 
TGTACA 
ACATGT 

BtaK I 

RGCGCT 
YCGCCR 

BsoO I 
CGGCCC 
CCCCGC 



2122 2403 

1005 1562 2882 1299 

5109 

1385 
3857 
687 3919 



B»oJI 

CCCGGC 

cceccG 

BtplW I 11 
GRGCTC 
CYCGRC 

a* P i20 i 2 

C CGCC C 
C CCGG^G 

B»pl9l 2 

C CATG G 
C GTAC^C 

B«p24 1 3 
CACMNMHHNIGGNNMMIWN 

CTGKKKJ«HACCKHKfW«:i 4 

B*p24 I , 3 



Sfie Summary by Eroyroi 
PosntofKs) 



HNNMN NHMMNMNHCTGNN 



Bap6 II 4 
CTCAAC 
GACTTC 

B«p87 I 3 
CACGTC 
GTGCAC 

BipG I 2 
CTCGAC 
GACCTG 

BspM t ' 

T CATC A 
A CTAC^T 

B0pKT5! " 

CT6AAGHMHKNMMNHMMMHN 
CACTTCMNNNWHMMMNMNMN 



BspLUtl II 

TCTACA 
AGATCT 



Bspfl I 



ACCTGCMNHM MMNN 
TCGACCMMMH MHMN 



B»rO I 

CCAATC HN 
CGTTAC HN 
t 

BtrE I 

ctchc 

CACAAC 

BsrF I 

R CCGG t 
T CGCC R 
t 

B*rC I 

4 

I G1AC A 
A CATGUT 

BsrV I 
GGATC 
CCTAG 



3722 

226 92 t 1000 2424 

922 2425 

1006 2707 

3001 4245 4262 

3033 4213 «30 

1537 3556 3749 4541 

676 1222 2398 

328 3246 
1473 

1559 3578 3771 4563 

1678 1684 

15 23 2704 3984 4373 

375 4347 4724 

2 897 4748 

499 3723 3761 



1523 1405 



C TCGT C 
S AGCA C 
t 

Bst1107 I 

GTA 7 AC 
CAT ATC 
t 

Bst29 1 
CCTNACC 

ccAirrcc 
aste ii 

G CTNAC C 
C CANTG G 
t 

BltHPI 

* 

GTT AAC 
CAA TTG 
t 

B»tX I 

4 

CCAN NHMH NTCG 
GGTN NMNN KACC 
t 

B*tZ2 I 

^GACNNNMNCTC 
CTGNNNNNCAC 

t 



CC TMA CC 
GG ANT CC 
1 

Cfol 

C CC c 
C CC c 

t 

ClrlO I 

R CCCG T 
Y CCCC R 
t 

Cfrl4 I 
YGGCCR 
RCCGGT 

Cir91 
I 

C CCGC C 
G GGCC C 
t 

CfrJQI 

* 

CCC GGG 
GGG CCC 
t 

ChM II 
GTYRAC 
CA*YTC 

TT CC AA 
AA CC TT 
t 

Dro I 

TTT AAA 



CACNN NN NKCTC 
CTGNN NN NNCAG 



1064 



3383 3838 



1499 1711 



2623 



1130 1413 4561 



1164 1195 4B05 4816 



3365 3940 



368 1103 3137 386 1 3723 



499 3723 3761 



B87 3919 3996 



1782 3098 4578 



4374 4620 



Drd II 
CAACCA 
CTTCGT 

Oao V! 

CTHKAC 
CAJCftTC 

Eoe 1 

T*CCCC R 
R CC66 Y 
t 

Eor 1 ^ 
CTCTTCM NNN 



t 



Ecol 

G CTNAC C 
C CAMTG G 
t 

Ed 1 
TCCGCC 
AGCCGG 

EclA I 
TACCTA 
ATCCAT 

Ec IE I 
GGGCCC 
CCC6GG 

Eel 137 I 
GACCTC 
CTC6A6 

CclHK I ^ 

GACNN N MNGTC 
CTGMM^H NNCAC 

Eco24l 

G RCCY C 
C YCGR G 
t 



Eco3ll 



GGTCTCN MNNN 
CCACAGN MHNN 



EcoSO I 
GGYRCC 
CCRYGG 

Eco52 I 
I 

c cccc c 

G CCGG C 
t 



Eco57 I 



GACTTCNNNNNNNNNNNNMN 



Eco64l 

G GYRC C 
C CRYC C 
f 

Eco72 I 
* 

CAC GTG 
GTG CAC 
t 

EcoB2 I 
CAATTC 
CTTAAC 

EcoBB! 

C YCGR G 
G RCCY C 
t 



2487 2822 4412 

1238 1583 1829 

888 3920 3997 

993 4742 

1499 1711 

100 
2538 

921 2424 
1000 

1190 4811 

231 926 1005 2429 

4238 4276 5183 

815 4508 5261 

888 3920 

1559 3578 3771 4563 
B46 4509 5262 
679 1225 2401 
670 2172 

1377 2223 3672 4014 



EcoO 
4 



TTANMNNNHHCTCY 
AATMNNNMNNC ACR 



EcoO XXI 

TCAMNMNMNNRTTC 
A6TNNMKMNNYAA6 



£coOH2 

ScAMMMNMNGTCC 
AGTNMNNNNCACC 



EcoE 



CAGNNNKNNNA T CC 
CTCNNNNMNNTACG 



EcolCR I 

6AC CU 
CTC GAG 
1 

EcoH I 

CCTHN K NNAGC 
GGANN N MMTCC 
t 

EcoO 109 I 

RC CMC CY 
YC CKC CR 
t 

EeoPIS I 



EcoR I 
I 

G AAT7 C 
C TTAA C 
t 

EcoR V 

I 

CAT ATC 
CTA TAG 
t 

EcoR 124 I 

CAAKNNNMNRTCG 
CTTNNNHNHTACC 



EcoR 124 II ' 

GAANNMNNNNRTCG 
CTTNNNNNNNYAGC 



EcoR02 

GAANNMNNNRTTC 
CTTNMMNNNYAAG 



EcoVIM 
4 

A ACCT T 
T TCGA A 
t 



Ecopi 



rr I 



CCANNNNNNNRTGC 
GGTNNNNNNNYACG 



Eapt6 I 
CCTCTC 
CCAGAC 



117 1757 3016 



1369 3996 



1003 



312 1447 



022 2125 3017 



379 546 



671 2173 



2125 2406 



1325 4026 



1349 3377 



3296 5040 



77 2087 4354 
CCTCTCM L

GCACACN MNHW 



Fbtl 

CI HK AC 
CA KtyC 

Fap I j 

ICC CCA 
ACC CGT 
t 

F»u I 

CACNMMCTC 
CT6KNNCA6 

CCCCC R 
GCCG6 Y 

t 

tap I 

CACCTG 
GTCGAC 

VCC ccw 
wcc ccw 



* I 

R CCCC T 
Y CCCC R 
t 



C^AATT C 
C TTAA^C 

H9«CI 

G GVRC C 
C CRTG^G 

Holt II 

ACCMWOUWCGT 



Hho I ^ 

G CC C 
C CC C 
t 

HfftS I 
CRCGYC 
CTCCRC 

HinJCI 

CTY RAC 
CAR YTC 
t 

HlnPI I 

G CC C 
C CC C 
t 

Hmc II 
i 

CTY RAC 
CAR YTC 
t 

Hind til 

A*ACCT T 
T TCGA A 
t 



1240 I5B5 4831 



387 



322 1 127 4238 

B9? 893 3024 3025 

3775 

q,q 1686 2801 3009 



871 2173 

BQ8 4509 5262 

2864 3625 

368 M03 3137 3681 3723 

1164 1737 4155 
2623 

368 MOI 3135 3659 3721 



77 2087 4354 



CTT AAC 
CAA TTG 
t 


I 


2623 


















H*p92 1 


3 


1186 1739 


4157 






CR CC TC 
CT CC RC 
t 












Lap 1270 1 
RCATCY 
YGTACR 


3 


49 212 


941 






rt SooOoo 
CATATC 
CTATAC 


2 


2122 2403 








niuiioe i 

RCCWCCV 
YCCVGCR 


2 


3015 4548 








niuii3 i 

4 

CC CC CC 
CC CC CC 

1 


2 


523 4060 








fed 

TGG CCA 
ACC 6CT 
t 


1 


3999 








mi i 

CAYNN NMRTC 
GTRHN NNYAC 
t 


4 


1411 3823 


3993 


4615 














rup20 i 

T6GCCA 
ACCCCT 

t 


2 


3996 4002 








ItepAl I 

4 

CHC CKG 

gkc cnc 

t 


3 


524 3778 


4081 
















Noe I 

* 

CCC CGC 
CCG CCC 
t 


1 


3725 








NCO I 
4 

C CATC C 
G GTAC C 
t 


2 


1406 2707 








NgoM 1 
* 

G CCCC C 
C GCCC C 
t 


1 


3723 








Nb« 1 

4 

G CTAC C 
C CATC C 
t 


1 


3742 








. Nl 1387/7 I 
4 

C YCCR G 
G RCCY C 
T 


5 


1381 2227 


3876 


4018 


4902 



4 

TCC CCA 
ACC CCT 
t 



Nsp I 



R CATC V 
t GTAC R 
t 



54 217 



Pfl!108 I 
TCCTAC 
AGCATC 

PlnAl 

A CCC6 T 
T CCCC A 
t 

PP.. ^ 

6 CCCC C 
C CCCC c 

1 

Ppvi 1253 t 
CACCTC 
CTCCAC 

Ppu6 I 
YACCTR 
RTCCAY 

Ppuft I 

RC CWC CY 
YC CUC^CR 

PsKA I 

GAXNN NNCTC 
CTGNN NNCAC 
t 

Psp1406 1 

AA CG TT 
TT CC AA 
t 

P«pAI 

C CCCC C 
G CCCC C 
1 

P "' » 

RG CMC CY 
YC CMC 6R 
t 

Pvu II 

CAC CTG 
GTC CAC 
t 



TCAT6A 
ACT ACT 



CCCACAWWOWNNMN NNH 
CGC 7 6T NWHMNNMMN NMN 



Soc 1 

G AGCT C 
C TCGA C 
t 



Soc 



t 

CC CC GG 
GG CG CC 
1 



Sop 1 



GCYCTYCH RNN 
CGAGAAGN MNN 



SouLPI 

CCC CCC 
CCG CCG 
t 



2927 4834 



3761 



926 2029 



1184 



Z1Z 1222 239S 2536 

3017 1550 

4160 

1614 1628 3336 4199 



925 2428 3020 



3424 5226 



1005 



525 4062 



3725 



Sco I 

ACT ACT 
TCA TEA 
t 

Sic I 

C TRYA C 
C AYBT C 
t 

Sqrk I 

CR CCGC Y6 
CT CCCC^RC 



Sin I 



CCCTC 
CCCAG 



ccc 



C TTRA C 
C ARYT C 
t 

Sno 1 

6TATAC 
CATATC 

Snofi I 

TAC CTA 
ATC CAT 
t 

SP.J 

A CTAG T 
T CATC A 
t 

C CATC C 
C CTAC C 
t 



654 2475 



S»ol 2 

C^AATT C 
C TTAA G 
t 

Ssp I j 2 

AAT ATT 
TTA TAA 

• t 

Sstl 1 
\ 

G ACCT C 
C TCCA G 
t 

Stu I 2 

ACG CCT 
TCC CCA 
1 

StySJ I 

^GACMNMNNKGTRC 
CTCNNNNNNCAYC 

t 

StySX.I 

CCATNNNNNNNCTTA 
GCTANNMNMHNCAAT 

t 



2597 2689 4080 5184 



2854 3862 4922 

1238 
2539 

3118 

217 



671 2173 



362 3408 



414 1686 
StySP I

AACNNMNNNCTRC 
TTCMWMHNHCATG 

t 

Syn II 

cttnSSIIc 



TcqU ^ 

CACCCAMNNNNNNNH NN 
rTCCCTMNMHMMNNN NN 



Toq II 



CACCCAl 

CTCCCTMNHMMMMNM NN 



Tthlll I 

CACN N NGTC 
CTCN N NCAG 
t 



Tthlll II 



CAARCAMNNNMNNNN MM. 
CTTYCTMNNWNNNNN NN. 



Ubol220 I 
CCCGGC 

cccccc 

Ubo1303 I 
CCRTCC 
6CTRCC 

Ubo1326 I 
RCCNCCY 
TCCNCCR 

UDOI382 I 
CAATGC 
CTTACC 

Von91 I 

CCAN NMN NTCC 
CCTN NMN NACC 
t 

Xba I 

1 CTAC * 
A CATC T 



CCANNNN N NNNNT6C 

i n r 



C CCG6 G 

c cccc c 

t 

Mill 

C GCCC G 
G CCGC C 



CAANN NNTTC 
CTTNN NMAAG 
t 



409 1367 3164 
2«7fl 2761 «10 

4842 

326 U3I 4242 
106 210 t 4494 
1376 

670 687 3762 3919 4316 
920 2423 3015 4546 

3265 

2366 

1679 
1163 
1377 

B88 3920 

Did 1372 3169 
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1. A method for bacterial speciation, comprising: 

i) isolating bacterial DNA from a sample, said DNA comprising DNA encoding 
16S and 23S rRNA; 

ii) digesting said isolated DNA with one or more restriction enzymes under 
conditions such that restriction fragments are produced, said restriction fragments comprising 
a first digestion product of said DNA encoding 16S and 23S rRNA, said first digestion 
product comprising at least a portion of said DNA encoding 16S rRNA and at least a portion 
of said DNA encoding 23S rRNA; 

iii) separating said restriction fragments; 

iv) detecting said first digestion product; and 

v) comparing said digestion product with signature bands of one or more bacterial 
species. 

2. The method of Claim 1, wherein said detecting comprises reacting a probe with said 
digestion product under conditions such that said probe hybridizes to said first digestion 
product. 

3. A method for bacterial speciation, comprising: 

i) isolating bacterial DNA from a sample, said DNA comprising DNA encoding 
16S and 23S rRNA; 

ii) digesting said isolated DNA with one or more restriction enzymes under 
conditions such that restriction fragments are produced, said restriction fragments comprising 
first and second digestion products of said DNA encoding 16S and 23S rRNA, said first 
digestion product being larger than said second digestion product, and comprising at least a 
portion of said DNA encoding 16S rRNA and at least a portion of said DNA encoding 23S 
rRNA; 

iii) separating said restriction fragments; 

iv) detecting said first and second digestion products; and 

v) comparing said digestion products with signature bands of one or more 
bacterial species. 
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4. A method for bacterial speciation, comprising: 

a) providing i) a first biological sample comprising bacterial DNA from a known 
bacterial species, and ii) a second biological sample comprising bacterial DNA from a 
bacterium whose species is unknown; 

b) isolating i) a first preparation of bacterial DNA from said first sample and ii) a 
second preparation of bacterial DNA from said second sample, said DNA of said first and 
second preparations comprising DNA encoding 16S and 23S rRNA; 

c) digesting, in any order, i) said first preparation of isolated DNA with one or 
more restriction enzymes under conditions such that a first preparation of restriction 
fragments is produced, said first preparation of restriction fragments comprising a first 
digestion product, said first digestion product comprising at least a portion of said DNA 
encoding 16S rRNA and at least a portion of said DNA encoding 23S rRNA, and ii) said 
second preparation of isolated DNA with one or more restriction enzymes under conditions 
such that a second preparation of restriction fragments is produced, said second preparation 
of restriction fragments comprising a second digestion product, said second digestion product 
comprising at least a portion of said DNA encoding 16S rRNA and at least a portion of said 
DNA encoding 23S rRNA; 

d) separating, in any order, i) said restriction fragments from said first 
preparation, and ii) said restriction fragments from said second preparation; and 

e) comparing said first and second digestion products. 
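
The digestion and detection steps recited in the claims above can be mimicked in silico when an operon sequence is already known. What follows is only a minimal Python sketch under stated assumptions, not the applicants' laboratory procedure: the function names, the toy sequence and the 16S/23S coordinates are hypothetical, and the only external fact relied on is the EcoRI recognition site (GAATTC, cut after the first G).

```python
def cut_positions(seq, site, cut_offset):
    """Return 0-based cut coordinates for every occurrence of `site` in `seq`."""
    seq, site = seq.upper(), site.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)  # step by 1 so overlapping sites are found
    return cuts


def digest(seq, site="GAATTC", cut_offset=1):
    """Split a linear sequence at each cut; return half-open (start, end) fragments."""
    bounds = [0] + cut_positions(seq, site, cut_offset) + [len(seq)]
    return [(a, b) for a, b in zip(bounds, bounds[1:]) if b > a]


def spanning_fragments(fragments, region_16s, region_23s):
    """Fragments that overlap both the 16S and the 23S coding regions."""
    def overlaps(frag, region):
        return frag[0] < region[1] and region[0] < frag[1]
    return [f for f in fragments
            if overlaps(f, region_16s) and overlaps(f, region_23s)]


if __name__ == "__main__":
    # Hypothetical toy operon with two EcoRI sites; coordinates are illustrative only.
    toy = "A" * 10 + "GAATTC" + "C" * 20 + "GAATTC" + "T" * 10
    frags = digest(toy)
    print(frags)
    print(spanning_fragments(frags, region_16s=(12, 20), region_23s=(25, 36)))
```

A fragment returned by `spanning_fragments` corresponds to the "first digestion product" of Claim 1, in the sense that it carries at least part of each coding region; its length (`end - start`) is the size one would expect to see as a band.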
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EcoRI ribotype RFLPs of H. influenzae isolates from diverse sources, 
including the genomically sequenced strain Rd 




1525 
1503 



serotype-Rd 



Rd a b c d Rd e f NT 1 NT 2 NT 3 Ap Ec 



LEGEND. 

Leftmost portion of the figure depicts the predictable EcoRI ribotype RFLP 
profile of the genomically sequenced H. influenzae strain Rd. Actual RFLP 
profiles of this strain appear in lanes 1 and 6 (so labeled). Other H. influenzae 
isolates, as indicated, are serotypes a, b, c, d, e, f and NT (non-typable, i.e., 
unencapsulated). EcoRI ribotype RFLPs of 2 non-H. influenzae isolates are 
shown in the rightmost lanes: A. pleuropneumoniae (Ap) and E. 
coli (Ec). Conserved, species-specific H. influenzae EcoRI ribotype bands can 
be seen at 1,748 and 1,503 bp (base pairs). These 2 fragments are absent from 
the type f isolate and one NT (NT,), both of which have subsequently been 
found by 16S rrn gene sequencing not to be members of the species H. 
influenzae. 
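
The legend above names two conserved EcoRI ribotype bands (1,748 and 1,503 bp) as species-specific for H. influenzae. A comparison of observed band sizes against such signature bands could be sketched as follows. This is an illustration only: the function name, the example band lists and the 3% sizing tolerance are assumptions and do not come from the source.

```python
# Signature band sizes (bp) quoted in the figure legend for H. influenzae / EcoRI.
H_INFLUENZAE_ECORI = (1748, 1503)


def has_signature_bands(observed_bp, signature_bp, tolerance=0.03):
    """Return True if every signature band has an observed band within
    +/- `tolerance` (a fraction of the signature size; assumed value)."""
    return all(
        any(abs(obs - sig) <= tolerance * sig for obs in observed_bp)
        for sig in signature_bp
    )


if __name__ == "__main__":
    # Hypothetical gel read-outs, in base pairs.
    isolate_a = [5200, 1755, 1498, 900]
    isolate_b = [5200, 2100, 900]
    print(has_signature_bands(isolate_a, H_INFLUENZAE_ECORI))
    print(has_signature_bands(isolate_b, H_INFLUENZAE_ECORI))
```

The tolerance is needed because band sizes read from a gel are estimates; a suitable value would depend on the gel system and size standards used.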
EcoRI Ribotype-based Phylogenetic Tree of a Diverse Collection 
of H. influenzae Isolates (types a - f, and non-typable) 
from Various Clinical and Environmental 
Sources and Geographical Locales 

(unpublished results) 




LEGEND. 

Phylogeny displayed here for >400 independent isolates has been 
independently confirmed based on serotyping, capsule operon gene 
polymorphism analysis, and comparative DNA sequencing of 16S rrn and 
neutral genes. 
DNA sequence and restriction map of the H. influenzae Rd rrnA operon (16s-spacer- 
23s-spacer-5s), with restriction sites noted for enzymes cutting 5 times or less. 
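
The caption's criterion, enzymes cutting 5 times or less, amounts to counting recognition-site occurrences per enzyme and keeping the infrequent cutters. A rough Python sketch of that filter is given below. It assumes recognition sites are written with IUPAC nucleotide codes (as in the site summary table) and searches both strands; the function names and the small enzyme table in the usage example are hypothetical.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(site):
    """Reverse complement of a site written in IUPAC codes."""
    return site.upper().translate(COMPLEMENT)[::-1]


def site_regex(site):
    """Compile a lookahead pattern so that overlapping sites are all counted."""
    return re.compile("(?=(" + "".join(IUPAC[b] for b in site.upper()) + "))")


def count_sites(seq, site):
    """Count occurrences of `site` on either strand of `seq`."""
    site = site.upper()
    patterns = {site, reverse_complement(site)}  # one entry if palindromic
    seq = seq.upper()
    return sum(len(site_regex(p).findall(seq)) for p in patterns)


def rare_cutters(seq, enzymes, max_cuts=5):
    """Enzymes that cut at least once and at most `max_cuts` times."""
    counts = {name: count_sites(seq, site) for name, site in enzymes.items()}
    return {name: n for name, n in counts.items() if 1 <= n <= max_cuts}


if __name__ == "__main__":
    toy = "GAATTCAAAAGAATTCCCCGGGTT"  # hypothetical sequence
    table = {"EcoRI": "GAATTC", "SmaI": "CCCGGG", "Tth111I": "GACNNNGTC"}
    print(rare_cutters(toy, table))
```

Enzymes with zero sites are dropped here because a map can only note sites that exist; whether the original figures treated non-cutters the same way is not stated in the source.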



1 m ■ \ \ W ill 

C ATCCCTC ACAT T6 AACCC7 GCCCGC AGCC TT AAC AC AT6C AACT CGA ACCCT ACC ACGACAAACC T TCC TTTC TTCC TGACCACT C6C CCACCCCT CA6T AATCC T TCC ^ 



ATTGAA6AGTTTGAT! 



TAAC1TCTCAAACTACTACCGACTCTAACT 



TGCGACCGCCCTCCGAATTGTGTACCTTCACCTTCCCATCGTCCTCTTTCCAAO 



GAAAGAACCACTGCTCACCCCCTCCCCACTCATTACCAACC 



1 



TGCT CGCGT AAATCCC TACCAAGCCTGCGATC7C1 



ill 1i iil 

ACC TCCTC TC ACACGATGACC ACCC ACACTCCAACT CACAC ACCCTC CAGACTCC T AC CGCACGCACC ACTC6CGAAT AT TCCCC AAT C6 ^ 



ACCACCCCATTTACCGATGGT 



1 



TCBGACEC«M«ICEAK*CACTCTCClACieCTCEBTETEACCTISACTCTSTECCACCTCTCA6GAICtCCTCCETCBTC»CCCCTTAT«£CCCTTACC 



1 



Hi 



^rrrTCACCtAGtCAT GCCaGTGAA.GAACAAGG^ 



CCCCCATTCAGCCACCGTCCTCCGCGCCATTATI 



ii 



ccaccaatacccaaccccaacccagccccttcceaatbtactc 



CAATTCTATTTCAC ACTCCCTAACTA6ACTACTTTACGGAG CCCTACAATTCCACCTCTAECCCTCAAATCCCTACACATCTI rrrTTlrlTcir 

cttaacgtaaagtcIgkccatt^ctcaicaaakcctccc c^ 



1 rTeftggifirAA ACAGGATTAGATACCCTGGTACTCCACCCTCTAAACCCTCTCGATTTCCCGAT^ 

TCCCAGTACACCCTTTCCCACCCCTCCTTTCTCCTAATCT ATCGGACCATCAGGTCCCACATTTCCCACAGCTAAACCCCTAACCCCAATCTCCAACCACGGGCATC6ATTGCACTATTTAGCTC 



s - 



CM 

ii 



tcaattcacggcgccccgcacaagcgctggagcatgtcgtttaa'TTCGatgcaacgccaacaaccttacctactcttgacatcctaaga iooo 



CGCCTGCGGAGTACCGCCGCAAGCTTAAA ACTCAAA „ 1TTrt 

CCCGACCCCTCATCCCCCCGTTCCAATTTTGAG1T1 ACTIA ACTGCCCCCGGGCGTGTTCGCCACCTCGTACACCAAATtAAGCTACCTTCCGCTTCTTCCAATCGATGAGAACTGTAGCATTCT 
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ACACC TCA£AWTCACCTTGTGCCTTC6CCAACTTAW 

TCTCCAGTCTCTACTCCAACACCGAACCCCTTCAATCTCTGTCCACCACCTACC^ 



1125 



ii|||sii 



ni 



IT 6 / 11 

»eCCACTTeSTCCCCAACTCAAACCACACTCCCACTCAT AAACTCCACW 

TCCCTCAACCACCCCTTCAGTTTCCTCTCACCCTCACTATTTCACCTCCTTCCACCCCTACTCCACTrCACTACTACCCCCAATCCTCATCCCCATGTGTCCACGATGTTACCCCATATCTCTCC 

E lis 



1250 



11 



i* ii! ill 



CAA6CCAACCTCCC*CCTCCACC( UATCTCATAAACTACCTCTAA6TCCGCATTC^ 

CTTCGCTTCGACGCTCCACCTCGCTTACAGTATTTCATCCAGATTCACCCCTAACCTCACACGTTGACCTCACGTACTTCACCCTTACCCATCATTACCCCTTACTCTTACACCGCCACTTATGC 




I 5 
11 



l! 



TTCCCCCCCCTTCTACACACCCCCCGTCACACCATCGGACTCCCTTCTACeAGAAGTAGATAK^ 



AACCCCCCGGAAC AT6TCTCCCGCGC ACTGTGGT AC CCTC AC CC AACATGCT CTTC ATCT ATC6AATTCCAAAAC C TCCCGC AAATCGT CCC AT ACT AAGTAC TCAC CCC AC TTC A6CATT6TTC 

•8 _ 5 158 5- 5 §r- 

■» t.8.S 



1 11 1 Hi if 11 



6TAACCCTACCSCAACCTCCCGTT6GATCACCTCCTTACTGAA6ACCACA CACACCGACTCCTCACACACATTGCCT 
CATTCGCATCCCCTTGGACCCCAACCTACTGCACXAATCACTTCTCCTCTCTCTCGCTCACGAGTGTGTCTAACCCACTATCAACATCTGTTTACTCCTCTTCTTTTCTAATGGGAACCCACACA 



AGCTCACGTGGTTAGAGCGCACCCCTCA1AACCGTGAGCTCCGTGC 7TCAAGTCCACTCACACCCACCACTCTCAGAGTCACTGAAACCAAAGGTCTAGTGTAAA1AAATAATAACTAACACA1A 
TCCACTCCACCAATCTCGCGTCGGGACTATTCCCACTCCAGCCACCAAGTTCACGTCAGTCTCGGTGCTCACACTCrCACTCACTTTCCTTTCCACATCACATTTATTTATTATTGATTCTGTAT 



AAT6ATTACTTATTAAATTAT6ATCAAATCCCGATATA6C TCACCTCGCACACCGCCTCCCTTCCACCCACCACCTCAC 

TTACTAATCAATAATTTAATACTACTTTACCCCTATATCGACTCGACCCTCTCGCCCACCGAACGTGCGTCCTCCACTCCCCAACCTACCGCCAATACAGCTCCTCAATAGTAACAATTCATTTA 



11 



TACAAAAAGACAAATTAATAAAGTATTAA6CTAGTGATTAAGCTATTTTAAATATTACGTTATTTTATTCTTAGTTTTAAGTTTAATTTTAAATTTAATTTTCTTTTAGTTTATTTAACCATGAT 

I iii | i i —.it ■■ «■ ■ » » ♦ - t t . ■ ■■■> . ■ t .. .< 2000 

ATCTTmCTCTTIAATTATTTCATAATTCCATCACTAATTCCATAAAATTTATAATCCAATAAAATAAGAATCAAAATTCAAATTAAAATTTAAATTAAAACAAAATCAAATAAATTCCTACTA 
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1 



TJCAAAATAC^ 



AAAACAAA^ CTCTTTTC TAAAAAMT^ 2250 
TTTTGTTT 



TTTC6ACAAAACATTTTTT TACAACTAACTTCTT'TTCCTTACTTCACAAATCAACTTAtTTTTATCCCTACTTTAACTC6CGTCAAACTTCACTTTTCCATTTC*CT*ACTT 



E 

-f> 

8=i5 i« 



CTCCA«C>T;7C**TIC«:C»TCCCMCTTCCACCT, C E ^C 6 TlIcTCTCCCCTTCTICCT C e«C«T,^c e CT T TTC C «>CC.«1CAGCT.TTeiCC C C^»I«GGTTC»T^ 



! 4 



7 



CCACTACAI^TC TAgATCAACAA^^ 2S0 0 

1 

^^.^GAAAGtCA AAGACICAGTAAQt^^^ 2625 
TCATCCCC6CTCSCTTTCCCTTTCTC6GTCATTCACTA TCGTTATATCACTCCTCTTACACAACCCTTCGTCTTACT^ 

_ I 

I 

AACCACAACTACCCCCSCACACCT&ATATCCTGTTTCAAGi 
TTCCTCTTCATCCCCCCCTGTCCACTAT, 

nil 





1 T.. 

C^T CCTCCAABCCTAAAIACTCCTGATTGACCCATACTCAACCACTACTGTGAACCAAAGCCCAAAACAACCCCC 
AGGACAAACTTCTTCCCCCCGCCTAfiMGGTTCCCATTTATGAGCACTAACTGGCTATCACTTGCTCATGACACTTCCTTTCCG 



paaatacaacctiUAACCTTCTA CGTACAAGCAGTCCSAGCCACCCCAACCTTCTCACTCCCTAC^ «an7R 

cactcccctcactttatcttggactttgcaacatgcatgtt cctcaccctccctccccttggaacactcacccatggaaaacatattacccactcg 

» T . „r^,UAAeCCAC,CtTA^^ 3000 
TATCCCCTCCCCTTCCCTTTMCTCACAATTCACCCCCT TATCAACCTTCCATATCTC66CTTTCGCCCACTACATCGCTACCCCTCCAACTTCCAACCCATTCTCATTCACCTCCTGCCTTCGC 

T/ S i 1 I I 1 II 

jjL. 1 . nf RRtTCACTTGT GCCTGGGGGIGAAACCCCAATCAAACCCGCAGATACCTCGTTCTCCCCCAAAT CTATTTACCTABAGCCTlCACGIGACACCTTTGCGGCTAGA ^ t ^ g 

TGATTACAACTTnTAATCGCCTACTCAACACCCACCCC CAtTTTCCCGTTACTTTCCCCCTCTATCGACCAAGAGGCCCTTtAGATAAATCCATCTCCGAACTCCACTGTGCAAACCCCCATCT 
rrnA Map (1 > 5519) Site and Sequence 



\ i i i i 

! *w irriirrruTcr T ACCAAT ACC AA AGAGT GAT ACT CJ 



GCACTGTTTCI 
CCTCACAAACCCCATC 



SCCTAGCCGGCCATCCCCCCTTACCAACCCCATCCA^^ 3250 



^C^ACCCCCC^TGCTTGGGCTACGTTTGATCCTTATCGTTTCTCACT 



tccttC J T j^^ 3375 
TGCCGGTCGATTCCAGGGCTTCACATATAATTCACCCTT TCCTTCACCC 

^^ ^mjg^gws^^ 3500 



i — a 



H 1 HI 

1 1 » • . ir*»«ortTfA*ftAArrrCTTrC£tCCAAGACCAACC£TTCCT6TCCAACI 



^TtCTWM^tjj KM^Mgl^ 3625 

TttCMAC^TT j^nttTCUg^ 3750 

m CGCATCAGCTACCCTTTGTCCAATTA,AAGGACATGA ACCA^ 



CttJ^mCTTAACACACgACA^^ 3875 



CCGTTTAGGCCT 



■gjy^g^^{TCTCTCTCTCTACTACTG CTCCSAGATGCCTCCACTTCATTGACTATGGTCTC*ACCTCCTTTTCCCTCATTCCCTTTCCCAAATCATTTCCCATGACTTTTGCC 



if irAIMTCCTCACCT AGACAATACTCAGGCGCTTCACACAAC^CGCTGAAGCAACTAGCCAAAATACCACCCTAACTTCGCCA 



TGT6TCCACCAGTCCATCTCTTA 



rCACTCCGCGAACTCTC TTGAGCCCACTTCCTTCATCCGTTTTATCGTGGCATTCAACCCCTCTTCCACCCCCCCCCATCTAACATTCCCCATCGGGCACTT 



" S r _' _ n-r-TTiTT > nAirtrtcr APT C ICC AAACACGAA* 



^^WcAGCtCCC.G^ 



CCAACTTGCCCAGCTTCTATGGTCCACCGACCTTGACAAA1AATTT 



TTCTCTCGTGACACCTTTCTCCTTTCACCTGCATATCCCACACTACCCACGCGCCACGACCTTCC4ATTAACTACCACA 
Wednesday, July 22, 1998 2:19 PM 



1 



GTAGCTTTCTCTTCGTGGACTACCTTCGGGGTCATTTGCCCCCCGCATTW 




]$ ! 1 ii ft 

ctgtct ccacccga£act^^ ,375 



5 



1 i 1 1) WJ 

CGATAG6T 6C6A6CCTTTGAA6CA6TCAX ,500 



„ , I \\\ \\ \ 

TACT TTGACTGCCCCCGTCTCC^ t 
ATCAAACTCACCCeCCCA^CGAC^^ 



3-5 



li 



1 \\ )] I 

caccaIktaccaaagtacctcata^ , 750 

CTCGTCCATGCTTTWTCCAGTATCACTACCCCACCAAGACTTACCTTC 



ls| 5 . 



GTTTcLcCTC GATGTCGGCT^ 1,875 



1 V 

1 1 



tc tgccctccgcctagcatgattcattcccgctcctcctactaccagacgaccggactccacccatca^ sooo 

^^^^^ 



8 I 



I 9 



aa ctgcicaaagcatctaaccacgaaacttcccaagacat6agtcatccctgactttaagtcagtaaggcttctt^ 512£ 
ttcaccactttcctacattcgtcctttgaacggttctctactcagtagggactcaaattcagtcattcccaacaacatctgatgctgcatctatccaacccacacattcactacactcagtaac^ 



IFF 



CCTAACCAATACTAATTCCCCWC^ 
CC ATTCmAT<UT1AACC6CCTCTCCCUATTCATAT^ 



5250 



am:a >uaaaactaaatatacaa£a^ ^ 5375 

TTCTTTCTTTTCATTTATATCTTCTWATIATCTTTCTTTTATCCTAACTCCAACACTCCTTATTCTTGCTCACTTTCC^ 

S 

fl 11 

M^AAAACACttUiTTATCAAAGAATTATCCTCC^ ^ 



TCTCTTTTCTCCTCAATACTT 



ACAC7ACCBCACCCCCACC 

» - I * 5519 

TCTCATCCCCTCCCCGTCC 



TCTTAATAGGACCCCCttTATCACCCCACCTCCCTOUCTCTCCTATGC^ 
DNA sequence and restriction map of the H. influenzae Rd rrnB operon (16s-spacer- 
23s-spacer-5s), with restriction sites noted for enzymes cutting 5 times or less. 



H 



8-| 



1 



J^TT^TCATGGCTCAGAT^^^^^ I25 



AACUCTCAAACTj 



T^TACCCACTCTAACTTCCCACCCCCCTCCCAATTCTGTACGTTCACCTTCCCATCCTCCTCTTTCB 



: TCC T CACCCCC T6CCC AC TC AT TACCAACC C 



¥f 1 7 



ill 11 i 



isilli 



111 



5? 

1? 



CCGAACCCTGAC 



cr4CCCATCCCCCgTGAATCAACAACGCCTTCCGGTTCTAAACTTCTTTCCCTAT- 



TC ACCAACCT TCATCTCT T AAT ACT ACATCAAATTCACCTT AAAT ACACAACAACC AC 



CCCTTCGCACTCCCTCCGTACCGCGCACTTACTTCTTCCCCAACCCCAACATTTt 



CAAGAAACCCATAACTCCTTCCAACTACACAATTATCATGTACTTTAACTCCAATT7ATGTCTTCTTCCTC 



500 




625 



cccgattgTScTccctcctccccgccattatccctcccaccctcccaattagccttattgacci 



11 



AATTCCATTTCAGACTCCCTAACTAGACTACTTTA6CGAC6G< 



ICTAGAATTCCACGTCT, 



ACCGCT GA AATCCCT AGAGATGTCC ACCAAT ACCGAACCCCAACCCACCCCCT TGCCAA TGTACTCA 
CGCCACTTTACCCATCTCTACACCTCCTTATCCCTTCCGCTTCCGTCGGCGAACCCTTACATGACT 



TTAACCTAAAGTCTGACCCATTGATCTCATGAAATCCCTCCCCATCTTAAGGTCCACATi 

o 5- §2" 

- - ills l£a 

1 # iii 
mm ,.,. m ^...^.M..»..^i.ctro.i^^ •" 



CCGAGTACACGCTTTCCCACCCCTCCTTTCTCCTAATCTATCGGACCA 



TCACCTGCCACAUTCCCACACCTAAACCCCTAACCCCAATCTCCAACCACCCCCATCCATTCCACTATTTACCTGC 



CCCTCflCGABTACGKffiGC/^ .000 
CGSACCCCTCATCCCGCCGTTCCAATTTTGAGTTTACTTA ACTGC^ 
Wednesday, July 22, 1998 2:37 PM 
rrnB Map (1 > 5273) Site and Sequence 




i i Mi 



RA ^TcIcACATCACCTT^ !l25 



III 



- 2 W 




5— e 



CCC*CTTCCTCCCGA ACTCAAAC6ACA^ 
CCCT CAACCAGCCCTTCACTTTCCTCTCACCGTCACm^ 



I 1! i s ii 



CCCGAATCACAATGTCCCCGTGAATAC CT 



AACCIUACCTCCCACGTCCACCCA^ 
TTCGCTTCCACCCTCCACCTCCCnACAGTATTTCATC^ 



1375 




1? 



3« g 



rcbceccccri cTACACACCccccGT^ )SOo 

^^nnACATCICTCCCCECCAC^ 



ii i t 

7 1 I 



n i 



lis 



TAACCCIACBCGAACCTCCCCTT^GATCACC^^ 1625 
ATTGCCATCCCCTTCCACCCCAACCTAGTGGAGCAATCAC^ 



W f v 

AA A C CTTAAAACATAAAAA6AAAATACACTATCTTTAAT^ n50 
TTTCCAATTTTCTATTTTTCTTTTATCTCATAGA^ 



— oj 

Si 



C AT AAC T T T AT T A CA TT CTCT T ACT GT TC TT T AAA AAA T TCGAAAC AACCTC AAAAC AACACATTTT C G A6A6AAAC T C TGACT ACCC AAAACACT AAAC TCAAACACAGAAC TCA AAACCAAAC ^ 
CTATTGAAATAATCTAACACAATGACAAGAAATTTTTTAACCTTTCTTCCAm 

TCTAAAAACAAAACCTCTTTTCTAAAAAAATCTTGATTCAACAAAAGCAATCAACTCTTTAGTTGAATeAAAATA^ ^ 
4Qj^y|yXXGTTTTGGAC AAAAC ATT TTTTTACAACTAACTTGTTTTCGTTAGTTCACAAATCAACTTACTTTTATGCGTAGTTTAACTGGCGTCAAACTTC^ 

4T , GAGGTTCTATAG TAA, rG ACTaA6CCJ*CAAGBJB6AT6CCTT6C^ 8t25 
TMM TCtAACATAttAAT«ACTQ»TT«^^ 



' • * • i a*»i i ir itpi i inirrrreicc 



eB ^ A CCCAC.ACAICAA CAATCJA»A^^ gSQ 

-RE I 

111 'l 

MtcT CATCGCCGCtCCCTT.CCC.TKTCOCTC»TIC «CUTCCTT^ 



E 




G GCCGCCCATCCTCCAACGCTAAATACTCCTCATTCACCCATAGTGAACGACTi 



^^ABAACTACtttGCCACACg^^ 2500 



J? 



CCCCGTGAGCCUGTGAMTAGAACtTGAAACCT^^ 262 5 



s 



i in 

CGC1TATCCCCTCCCCTTCCCTTTCCCTCAGAATT: 



TicACCCCCTTATCAACCTTCCATATCTCCCCTTTCGCCCACTACATCCCTACCCCTCCAACTTCCAACCCATTCTCATTCACCTCCTCCCT 



S = = UJ 

Ml ii ii 



ACCCACTAAKTUAAAA^ 2S7S 
TCCCTGATTACAACTTTTTAATCCi 



cccctactcaacacccacccccactttccccttactttcixcctctatccaccaagaccccctttacataaatccatctcccaactccactgtccaaaccccc 



- i B 

H 

TA6AGCAI 

ATCTCGTGACAAACCCCATCCCCCGCTAGGGCCGAATGCTTGGGCTACCT 



1 \ ±JJ 



T ^«mc« tt T.« M «CCCAUCC«^^ 3000 



TTCATCCTTATCCTTTCTCACTATGA6TCCTCTGTCTCCCCCCCACGATTCCAGCCACCACCTCTCCCTTTGTTG 



1W 1 11/ 

CCAtUCCCC CAttTAAGCTCCCCA^^ 3 , 25 
6£TCJCGCCfiTCCATTCCACGG6TTCAGATATAATTCACCCTTTK 




ACTCCCC C T6CCC^ 3250 
TCACCCCCACttGCCTTCTACATTCCCCCCACTTTATATCCTGGCn 



§ si I 

H 1 HI 

^CT MCACAACTCKAATttT^ 3375 

tccatactcticacccttaccactctattcattcctattttccccactttitccccaaccccccttctccttcccaacoacaccttccaatta^ 



\ 1 



CTUuACC CTACUCATggAAA^^ 350 0 
CACTTTTCCCATCACCTACCCmCTCCAATTATAA^^^ 



tttacgcaaatcccgacttccttaacacacacagatgatcacgaccctctacgcagctcaagtaactcata 3fl25 



"IACGCCTCAACCAATTGTGTCTCTCTAC1ACTCCTCCCACATCCCTCCACTTCATTCACTATGCTCTCAACCTCCT 



TT7CGGT6ATTCCCTTTCCGAAATGATTI6CCATGACTTT 



ACCCAC AC AGCTCCT C AGCT AG AC A A T AC TC ACCC GCT T G Al 




.CAGAACTCGGCTGAACGAACTACGCAAAATACCACCCTAACTTCCCCAGAAGCTGCGCCCCCGTAGATTGTAAGCGCTACCCCC 



TGGCTGTGTCCACCACTCCATCTCTTATGAGTCCGCCAACTCTCTTGAGCCCACTTCCTTCA 



TCCGTTTTATCGTGCCATTGAACCCCTCTTCCACCCGCCCCCATCTAACATTCCCGATCGGGC 



3750 



8 *- * 

TGAACC TTCA ^CTCC/U^TttCA^^ M 75 




— *,ON5- _ 
«° — o«"= in (v 



GTCTCATCBAAAftMA ACCAgTC^^ ,000 
— ^^^^^^ 



Mi i \ I il 

M rr A .l J Cte tC>TT C ,C^CC.CeTt e M,T.CCK CCTnOCaTTT t U 6 TTCT^CA.a»T t ^.»C 6 TCCTCTC^C Wg TCTCC ^ 

IU f M i M H "f I 

I I I ____JL_... #»l»t/?c^ccMCCTTTCCTAATCAC(^TCCCACATCCT6ACCTTACT&CAATGCTATA^CCAACCTT^CTCC6ACACA6AC< 



ACCCATCAAACTCACCCCCCCABACCACCCTTTt 



TTCCCATTCCCTCCTCC TCCTTCCAAACCATTACTCCCACCCTGTACCACTCCAATCACSTTACCATATTCSTTC&AATTCAtGCTCTCTCTCT 



i Ji 



TCAGCTCGTCCATCCTTTCATCCACTATCACTAI 



^^CACTTACCTTCeCCCTACCCA^^ 



* JJ8JSLJ 



reCACCTCCATCTCg CKA^^ « 



1 1 



■ ^^nCC^TCCT CCTAGT^gAC^ 



1 1 



5- I 
SI 



CTTTVACTC*CT*AGCCTTCTlCI IU! »CI*CC*CCT*CAT>GCTICCCTCT6TtA6TeAT6T6AC1CA 



// 1 j. 



l,. 1 r,».*C.ACtCTCA*CT C TTT T TC C <UCC 1 «»CTC»CT>*C»E»TC>».ACCC>>GCA»AT*>*ACC* l UCCCA»A e A 6 A«CmAAC ^ 

ru — 

I 

AUAAACAAACAAAACTAAATATACA^^ 
TCA TTTCTTTCTTTTCATTTATATCTTCT^TTATCTTTCTT™ 



5125 



8 

! 

TAAAiUUCA A A ACACCACTTAT^^ 
ATTTTCTCTTTTCTCeTCAATACTTTCTTAATA^ 



T6TCACA6TACGCCACCGCCACC 

i * \ ' » ■» 5273 

ACACTCTCATCCCCTCCCCCTCC 



WO 99/05325 16 , ^ PCT/US98/15464 
EcoRI ribotype RFLPs of E. coli isolates from diverse 
sources, including the genomically sequenced strain MG1655. 

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 

LEGEND: 

Leftmost portion of the figure depicts the predictable EcoRI ribotype 
RFLP profile of the genomically sequenced E. coli strain MG1655. 
Actual RFLP profiles of this strain appear in lanes 1, 8 and 17. 
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Ftk.7 

DNA sequence and restriction map of the E. coli MG1655 rrnA operon (16s-spacer- 
23s-spacer-5s), with restriction sites noted for enzymes cutting 5 times or less. 



AAATTCAACACTTTCATC 
TTTAACTTCTCAAACTAC 

^ 16s begins 



R -_ « 



5S|- 



TTC6KCCTCTTCeCATCCeATClCCCemT«WlAa^ 



ffl 



01 



till I 




6 - 

n 



I 



1 1 



ii - B 



Sj fell 



11 1 1 11 

„ , . — — - - rCCCCCCCCTTCTACACACCCCCCCK, 
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PLC*. 1 C**Hp( 

DNA sequence and restriction map of the E. coli MG1655 rrnA operon (16s-spacer- 
23s-spacer-5s), with restriction sites noted for enzymes cutting 5 times or less 

(continued 1). 



1 



TCCCCACCCCCCTT ACCiX TTTCTCAT T UTCACI CCCCTCAABTCCT AACAACCT AACXtt ACCCCAACCICtttl TSOATCACCTCCT1 ACCTT AiAAACCCTT tt T TBAASTCCTCACACASATTCTC TBATOAAAATCAC 1SS5 
AttCCTCCCKC^TCCTOUUtfACIAACTACTMCCCCACn^ 

16s ends 4 



LI. 



a__a_ 



r»ctAAAACCKTACAC«TTGTACCTCAeCTKTTASACCttACC£CTCATAAatt^ 17KO 
CTCAT^BCACATCTCCCAUATCCACTCCACCAATCtC^ 



if 



1 ^ 



TCACCrCCCACACCCCCTCCTTTCCACCCACCACStCTCCCCTICCATCCCCCATABCTCCACCATCTCTGlACTeATTAAAIAAAAMTUTTCAft^ 1U5 
ACTCUCCCTCTCCCCCACCAAACCrCCCTCCTCCACACCCCAACCTACCCCCTATCeACCTCCTAGACAtATCACIAATTTATTTTTTA 



CATCAACCTCAAAATTCUAACACT6AACAACCAAACTTCTTC£7Ctt^ 2030 
C TACTTCCACTT t TAiKTTICTIUCTTCTTttTTUAACAAttACTCACACACTTTAAA 

T 23s begins 

s 

MSCACBTCCTAATCTreaATAACCBTCraTAASSTCAmCAACttTTATAACCtt 2175 
T TCC TCC ACCAT TACACCCT ATTCCtACCCAT TCC AC T AT AC 1 TCCCAAT AT TCCCCCC t AAACCC TT ACCCC Tt I CCCTCACACAAACCTCTOrCATACIAATTBACTTABSlAICCAATlACTCCCCrTCCCCCCCnCACTT 



ACATCTAA61ACCCI 



X ACCCCACACCCTCAA1 CACTCTCTCTCT T ACTC8AABCCTCTCCAAACCCCTCCBAT ACACCCTCACACCCtCCT AC 2320 
TTCCC ACACCTTTCCCCACCCTAT CTCCCACTCTCCCCCCATC 




ACAAAAA TCC*C AX 6C 1 CT CACC1CCATCACT ACCCCCCSAC ACCTC&T AT CC T CT CT CAAT ATCCSCC6ACCATCCTCCAACCCT AAATACTCCTCACTCACCBATAC16AACCACrAI 
rCTT T T TACCT C T ACfiACACTCSACCT ACTCATCCtXCCC TCTU ACCAT AMACA6AC T TAT ACrXCCCTCXtACCAtt 



VP 




CCCKTCCCCTCACmTTClTC^TTTCMACATCCATCTTCCTCACCCTCCTtt^ 



f 



AACC6ACTCT1AACTCCCCCTTAACTTCCACS6TATAUCCCSAAACCC6QTCATCT ACCCATBOCCACCITCAAfiCTTCCCTAACAC 27S5 
TTCCCtCAI^ATTQlCCCKAATTCAACCTCCCATATCTCSCCTTTCGttCACTACATCKTACCCCTCCAAC 



CCTCAAACCtCAATCAAACCCCCAUtACCTCCTTCTCCCCUAACCTATTTACXIACCCCCTCCTCAATTCATCTCCCCC^ 2900 
CCACTTTCCWTUCTITC^CCtCTATCC^CAACACCCCXTTTCCAtAOTCCATC^ 
DNA sequence and restriction map of the E. coli MG1655 rrnA operon (16s-spacer- 
23s-spacer-5s), with restriction sites noted for enzymes cutting 5 times or less 

(continued 2). 

: 518=1;- - 5 = 




I 1 W 

flip M 1 1 1 



\ i__Ll 



1 ......t ri >-ti-riiriiirricric 



MiJicTisACCcutiictTTwicceinii 

sir" 



39 IS 

TCCATCCC 



ft ! IP fl 1 



^TCCTTCTCCMTAACTTCCCACC 



1 1 ii i d ! , M 1 i i 1 
DNA sequence and restriction map of the E. coli MG1655 rrnA operon (16s-spacer- 
23s-spacer-5s), with restriction sites noted for enzymes cutting 5 times or less 

(continued 3). 



CICCCAAACCACCT C AT ACTCA ICCCCTCC TTC 1 CAAT CCAACCCCt AT CCET UACCC AT AAAACCTACTCtCCCCATAACACCCT WTACCCCCCAACACTTCATATCCACCCCCCTCTTTCCCACCTCCA1CTCCCCTCATC < 



ACATCCTCCCeCTtAACTlCC^CCCAACeCTATCCCTCTTCeCCAT^^ 4640 
WACMCCCC^TKATCCACra 



1 1 



^^^TP^CCATCAtTCCTCTTCCCCTTCTCATCtC^ 47B5 

ftCAACgAA£CTT6AACAC8ACCACCTTMTACCCCCCCTCTCTAAgCCCACCMTCCeTTCACCTAA 4030 
CACITCCTIMAACTTCTCCTCCTCCA^ 

23s ends 4 

8 



] MM 



CATACACATTAmCACAACCCACAAKCCTCTCAIAAAACACAAmCtCtra SO?S 
CriTCTCTAATTlACTCTT^ 



AAACCCACCi 

f 5s begins 

»7 

1 

Ssends J 



ACACTACCCA ACTCCCACCCAT S097 
TCTCATCCCTT6ACCCTCCCT; 
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fib. 1 



-J — I 
Restriction map of the rrnA operon (16s-spacer-23s-spacer-5s) of E. coli MG1655, 
with restriction sites noted for enzymes cutting 5 times or less (continued 1). 
Restriction map of the rrnA operon (16s-spacer-23s-spacer-5s) of E. coli MG1655, 
with restriction sites noted for enzymes cutting 5 times or less (continued 2). 
DNA sequence and restriction map of the E. coli MG1655 rrnB operon (16s-spacer- 
23s-spacer-5s), with restriction sites noted for enzymes cutting 5 times or less. 



11 w 

AAJ 

Tr 

4 16s begins 



toll 



5 ! ! 

1.1 11 



AAA7 TeAAaACTT10ATCATCC£TCACAnuUt£CCTreCCCACttCTAAC^ H5 
TT7AACTTCICAAACTAC7ACttACTCTAACTTttCACCCCW^ 



CCCAIAAClACTttAAACC«lACC7MlACCCCA7AACETCCCAA«ACCJ 



8- 

11 



ii i 



XTTCCCCCCTCTTCCCATtCOA7CTCCCCAWTKCAT1ACCTiLCTAC0TCCCCTAACCCCTCACCTACCCCACClTCCCtA£C 200 



2 : 



111 1 



71* 1 



7PS7C7tUCAC^7tUCCJ£CCACAC7CSAAC7CACAtACtt7CCiUU^^ 435 
Ata^TC7CC7A£7C*7C6ClC7CACC77CAC7CTC7CCCA«7CTeACCA7CCCC7CC6W^ 




mTCAeeC ttCC > CT lAC CT AB7AAACTfAATAteTTTECTCATTraST^ 560 
WUUCTCCCCCCTCCTTCCCTCATTTCMUATCfiAAAC0ACTAACTKAATCCCtBTCTTCTTC8TCttC(UTT 



91 



t7TTBTTAAQTCACATC7BAAA7CCCttCC£TCAAC£TCCCAACT6CA7CTEATACTC^ 725 
AnCJW7C7ACAC7nACCBKCC6ACT7CSACCC7TCACC7ABACTA7CJU^ 



1 1111 




;CCTeSAXCAACAC76ACrX7CASS7CCSAAACCCTECCSACCAAACACCAT7ASATACCCTCC7Ara 870 

CC T 7CCCCCCCCCCACC TK 7TC 7CAC TCCCACTCCACCC 1 T TC6CACCCCTCC Tt ICTCCT AA1 CTATCCCACCA1 CACCt CCC0CA r T TCCT AC ACC ICAACC7CCAACACBCSAAX7CCCCACCBAACCCC1CCAT7CCCCA 




mP* 1*1 



1 1 



7AAC7CCACtCCC7SCCCASTACCSCCCtAACS77AAAAC7CAAA7CAA776AC 6 CC CCC C€CtACAA^^ 1015 
ATtCACC T CCCCCACCCC 7C ATCCCC6C67 TCC AAT T T 7CAC1 T ? AC T t AACTCCCCCCCCCCCT BT TCGCC ACC7 CC7 ACACCAAA7 7 AABCTAC CT TCCCCTTCTTeCAATMACCACAACTCTACCTCCCTTCAAAACTCTC 



A7BACAA7B7CCC7TCCCGAACCC7CACACA«7CCtCCA7CCC!C7CC7CA^ 1180 
TACTC77ACACCCAABtCC77CIXAClC7E7CCACSACC7AC£CACASCACT re » CT ^ 



? S. 



11 1 1 15 



ClUCTCA7AAAC7raACSAACC7t«CA7SACC7CAAC7CA7CATCtXCCrtACCA^ 1305 
BCTCAC 7 A7 7 76ACC7CC7 7CC ACCCC 1 AC 76CAC 77CAC7AB7 ACCCCCAA T6C7BC 7CCCOA7BTB7CCACSATOT 7ACC6CB7 A7CI7 7C7CT7CCCTCCACCCCTC7C677teCC7CCAC7A77TCACCCACCATCACtCC 



1 111 111 IP 



f 1 



A77CeAC7C7CCAAC7C6AC7CCAtEAAC7eCCAA7C6C1AC7AA7C676BA7CACAA76CCJu:CB7BAA7Att^^ 1450 
7 AACC 7CACAC67 76ACC I CACB7 fcC 7 1 CA6C C 7 T ACC6A7 C A 7 7 ACC ACC 7 AC7C 7 7 ACBB 76CC AC T 7 A7 CCAACCCCCCCCAACA7B7 C7 CCCCCCCAC7E7BC7ACCC7CACCCAACC7777C 77CA7CCA7CEAA77CBA 
DNA sequence and restriction map of the E. coli MG1655 rrnB operon (16s-spacer- 
23s-spacer-5s), with restriction sites noted for enzymes cutting 5 times or less 

(continued 1). 

16s ends ^ 

f ffi I 1 9 V I \ 

if 

1 .....iiMirrtxiptrirtriiiT 



. 1 

4 23s begins 



^ 1 Pi 
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1 "I 



. 5; ; . 5_ 
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1 1 I 
DNA sequence and restriction map of the E. coli MG1655 rrnB operon (16s-spacer- 
23s-spacer-5s), with restriction sites noted for enzymes cutting 5 times or less 

(continued 2). 



i i 



tArCCCAZUUTCMAKACCSCACACAtACCCCCfXt^AAtCICCCTCCTB^^ ws 
ATCCCCTCT1 ACAA T A6TCCCC TC T6I C1CCCCCCC ACCA TTCC ACCCACCAC T TC t CCCrrTCTTeCCTneCCCCICCATTCCACCeTTTCACllCtAATTCACCCTTTCCtACACCCTTCCCCCTCTCTCCCTCCTACAACC 



i 



mA£AACCA££CAHATTTAAACAAACCCTAAlACClCA£ICCTCCACTCCCCC1CCCCCC^ 3100 
_MTCTTCCKKlAeiAAATTICT7TC*CAIIATCSMlMCCA^ 




5 - i 

1 



\\ \ 1 ^ 



TgCAACCTCUCTgTCACCCAlCCTClUCCTATCACAilSTrc^ 3335 
ACCCTICCACACtACACTCCCTACSACCKCAlASTCTTCACCCTIAeeAC^^ 



_____ _ , t Tccc««KAr.r.rT4AiATKCictAcnMicTiACictCu«ccccAtc^ : 

ICCCCrCCCCCTTTCCCCATCA6C?UCCTTT6TCCAATUTAACCACAICAA£CACAAfCACCtTTCCCCCCTCCCTCTrcCCATACAACCCCCCCCC 




rC.ftft M Tf**Eg"?*^ETCArcA£-Ag|gArTACCSl^ 1836 
ecCITTTAeTTCCCACTCCMACIACTCCTCCCTCATCCCACCACTTCGTTCT^^ 



ft 



CCCTTUUSACA-CTCCCCTCAjKCAACIACCCAAAATCCTCCCCtAACfTCCeSAC^ 1770 
CCCAACKTCTT6AttCCACTTCmCATCCCTlTUCCACCCtATTCAA^ 




ijlj 



AtAttAClCTCCAAACACCAAACICBACCTATACCCTCT&ACeceTCCCCCCm 3BI& 
TCTCCTBACACCTIICIGCTTKACCICCATAICCCACACTCCCBACCCCCCAC^ 



W 1 




i 



CC*JU»TCCTTB7CCtCTAiU8TrCCSACCTCeACCAATCCCefAAl6AtCCCCACSCTCTCTCCACCCCACAC 1060 
CC T T I AAG6AAC A3CCC A T I C AACCC fCCACC 1 CC I T ACCCC A T 1 AC T ACCCCT CCCACACACSTCCCCTC 1CACTCAC TTlAACTTBACCCACACTICIACCTCACArCCCCCCCCTTCrCCCTTTCTCCCCCACTTCCAAATC 



lATAttTTCACACTCAACATICACCCTiCATCTCTACCAIACCTCCCACCXTTTC^ 4205 
AlATCCAACTCICACTTClAACtCSCAACUCACAICCIATCCACCCTCCGAAACTTCAC^ 




5 o 

1 \ 



CACACTCUTCSTCCCIASTilCACTCCGCCCClCTCeTCCIAAACAClAAC^ 1350 
CTCTCACACACCAC::»TC*AACtSi:CCCCCCACACCAKATTTCtCATT«C^^ 
DNA sequence and restriction map of the E. coli MG1655 rrnB operon (16s-spacer- 
23s-spacer-5s), with restriction sites noted for enzymes cutting 5 times or less 

(continued 3). 



tec t cccaaA£Ca C C T C A T A C T C A T CCCCTCCT TC TCAA ICCAACCGCCATCCC TC AAXCCAI AAAACC T AC T CCCCCCAT. AACACCt T CAT AC CCC CC AACA6 T 1 
WACCCTTTmCCACT*TCAC*AS« 



\ \\\ '§ i! i 1 If 1 

Tr»r itrr TCGCCf TCAACT ACCTCCC AACCCTATCCCICTTCCCCATT |AAA6TCCTACCC6ACCtC6CTTTA6AAC01CCT6ACACA6tTC6ttlCCCTATCtCCCCTBCCCCC?C0ACAACtCACCCBCCC16CTCC'*6tM «6«0 
J^tlttACKCCACnCATM 

\'\ ft 

mMCCArrcCACTCCAgCCATCACTCCTCT TCCCCTTCTCATCCCAATCCCACTCCCCCCTACCTAAATCCCCAACACATA^TSCTCAAACCATCTAACCACCAAACTTCCCCCCACATCACTTCTCCCT&ACCCTTTAACCC «7B5 
CTCTCCT6CCC fC ACCTCC C T AC TBACCAC AA6CC CAAC ACT ACCGT T ACCCT 6ACCSCC C A TCCAT T T ACCCC TTC TC TAT TCACCAC T T TCC1 ACAT TCCTCCTT T6AA C CCCC C TCT ACT U A C A CCC A C TCCCAAAT TCCC 

S U ill 



9= 



rrr TCAACtAATCT IQAAOACCACCACC T T CAT ACCCCCCCTCTB T AACC6C ACTCAT CCCT TCACC T AACCCO T AC T AAT BAACCCTCABCC T TAACC T I AC AACCCCCAACCTCT TTTCCCCCATCACAfiAAfiATT T TCACCC «O30 
AttACTKCTOAACTTCTCC^ 



23s ends 

i 5«= 



f i Hi i? 

IT T TKCTCCCCCCACT AGCBC6C " " ' — — 

(AAACCCACCCCCCTCATCCCCCC. 

^ 5s begins 



TCA1 AC ACA T T AAA T C ACAACCC ACAACCCCT C T CAT AAAACACAAT T TKCTCCCCCCACT ACCKCCCTCCTCCCACCTCACCCCATCCCCAAC T CACAACT CAAACCCCC 1 ACCCC CCATCCT ACTC TCCSOTC TCCCC ATK 5075 
IcUTCTCTAATlTACie^^ 

CACACT ACCCAAC T CCCACCCA1 SOU 
CTCICATCCCTTCACCBTCC6TA 

5s ends 4 
Restriction map of the rrnB operon (16s-spacer-23s-spacer-5s) of E. coli MG1655, 
with restriction sites noted for enzymes cutting 5 times or less. 
Restriction map of the rrnB operon (16s-spacer-23s-spacer-5s) of E. coli MG1655, 
with restriction sites noted for enzymes cutting 5 times or less (continued 1). 
Restriction map of the rrnB operon (16s-spacer-23s-spacer-5s) of E. coli MG1655, 
with restriction sites noted for enzymes cutting 5 times or less (continued 2). 
DNA sequence and restriction map of the E. coli MG1655 rrnC operon (16s-spacer- 
23s-spacer-5s), with restriction sites noted for enzymes cutting 5 times or less. 



I 16s begins -- 5 _ 



1 



C?6TTTCC£T6AC6ABT6CCS3AC6ttTBASf AATBTCTCCCAAACTCCCTBATCSACC 145 
^TTACACACCCTTTBACfiOACtAtCTCC 



9 1 



^.♦^^TMAAiMMTACMaAlACCCCATAAteTCSCA^^ **> 



T«i.TrTeAiu M *ll^nu:eeie»eTCCAACTCA^ 



BCAAC 
COTTC 

1* 



if 



W 11 



C^^C^CWTTCCCTcIm^ 



3B 8 



1 



- 8 



1 n 



1 



11 15 If 




X TT ACCT6CTC TTCAC AT CCACCCAA6 T T T TCACACA 1015 



11 ii . W 



2 

Si! 



ft W 



ii in 



i I. *S I ISM 



#-l»f»»»»rVrre^Ttif f ArTTteT^tTCaTtACICCCCTCAACTCCTAAtAACCl AACCCTACCCCAACCTCCCCl TCtATCACCTCCTIACCTTAAAOAACCCTTCTTTCCACTCCTCACAtACATTBtCTCATACCAACTCAA 1S9S 

16s ends ^ 
DNA sequence and restriction map of the E. coli MG1655 rrnC operon (16s-spacer- 
23s-spacer-5s), with restriction sites noted for enzymes cutting 5 times or less 

(continued 1). 



:tcti 

IACA1 

1 



lAeCAACCteTCTTCCCJU^ACACTCATACCTCCCeTTCCKTAe^^ 17*0 



3 1 



rTgAAAACTgAtCITCCCSTCATCTTTCAftATAmCCTeTTIAAAAATeTCtt^TCA^ 1885 



riMTteTeACSTTAACCCACTAASCCTACAtCCTCSATCCCCTeCCACTCAai^ 3030 
ScJJxaCTCCAA^^ 



^ 23s begins 



1 1 



CTTTeCACACAeTAUATTAAClCAATCCATACCTlAATCAWCM^ 2«75 




CAfilATCTBTCTTACTCgAACCCTCTCCAAABCCCCCCCAIACAra 3320 



A3TCA 
TCAC1 

8__ 



ixTiAATJicTccTCiJTCJttccAiArTiuAceACTtcecTC A rc m ^ "ft* 




SuitacSact^c^tataac^ 



*AC4£lA8CTeaiSSACCCAACC6AtrAATCllCAA^ 2755 



H 

TACACCAC1 
ATCTCCTCi 



i i ! ftp 



TACACtACTCUTCCCCAACCCCCTCAICCCIUCTIACCAACCCaiTCCAAACTCCCAAtACCCCA^teTTA 2900 
ATC^^AAA^mKCcIcTAScScAAW 



3 1 

i i 



CCC»l/KICMM!t«»CtCCC»»KC«tC!CCC"CKCCAOC«CC»CCAieTtKC^ "> 



S6 - 



I 5 



s * 

11 



rirrcAACCTttCCeAKCACCCTTATCCGTlGlTWlAM^^ 3190 
DNA sequence and restriction map of the E. coli MG1655 rrnC operon (16s-spacer- 
23s-spacer-5s), with restriction sites noted for enzymes cutting 5 times or less 

(continued 2). 
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CASA6AACfCCCCT0MCCAAClACCCAAAATCCTCCCCTJUC1Tl 



B1CCATT6TA0TI1ACCA 
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ACCACCCtATTBAACCCCTCTTCCCTCCeACTATACATCCACTTCCCIOA 
DNA sequence and restriction map of the E. coli MG1655 rrnC operon
(16s-spacer-23s-spacer-5s), with restriction sites noted for enzymes cutting
5 times or less (continued 3).
[Sequence figure; annotated landmarks: 23s ends, 5s begins.]
Restriction map of the rrnC operon (16s-spacer-23s-spacer-5s) of E. coli MG1655,
with restriction sites noted for enzymes cutting 5 times or less.
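The selection rule stated in these captions (only enzymes that cut the operon 5 times or less are drawn) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names, the small enzyme table, and the toy sequence are hypothetical and are not taken from the figures. The recognition sites used are palindromic, so scanning one strand is sufficient, and the reported positions are site starts, not cleavage points.

```python
import re

# Illustrative enzyme table (assumption): a few well-known palindromic
# recognition sites, not the enzyme set used to draw the figures.
ENZYME_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "AluI": "AGCT",
}


def site_positions(sequence, site):
    """Return the 1-based start position of every occurrence of `site`."""
    # A lookahead pattern also reports overlapping occurrences.
    pattern = re.compile("(?=" + re.escape(site) + ")")
    return [match.start() + 1 for match in pattern.finditer(sequence.upper())]


def rare_cutters(sequence, enzymes, max_cuts=5):
    """Keep enzymes that cut at least once and at most `max_cuts` times."""
    selected = {}
    for name, site in enzymes.items():
        positions = site_positions(sequence, site)
        if 0 < len(positions) <= max_cuts:
            selected[name] = positions
    return selected


if __name__ == "__main__":
    # Toy sequence (hypothetical): one EcoRI site, seven AluI sites, one BamHI site.
    toy = "TTGAATTCAA" + "AGCT" * 7 + "GGATCC"
    for enzyme, positions in rare_cutters(toy, ENZYME_SITES).items():
        print(enzyme, positions)
```

In this toy input AluI is dropped because it cuts seven times, and HindIII is dropped because it does not cut at all; applying the same filter to a real operon sequence would require the actual sequence and enzyme list, neither of which is reproduced here.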
Restriction map of the rrnC operon (16s-spacer-23s-spacer-5s) of E. coli MG1655,
with restriction sites noted for enzymes cutting 5 times or less (continued 1).
Restriction map of the rrnC operon (16s-spacer-23s-spacer-5s) of E. coli MG1655,
with restriction sites noted for enzymes cutting 5 times or less (continued 2).




WO 99/05325 



39 / 67 



PCT/US98/15464 



DNA sequence and restriction map of the E. coli MG1655 rrnD operon
(5s-spacer-23s-spacer-16s), with restriction sites noted for enzymes cutting
5 times or less.
[Sequence figure; annotated landmarks: 5s ends, 5s begins, 23s ends.]
DNA sequence and restriction map of the E. coli MG1655 rrnD operon
(5s-spacer-23s-spacer-16s), with restriction sites noted for enzymes cutting
5 times or less (continued 1).
DNA sequence and restriction map of the E. coli MG1655 rrnD operon
(5s-spacer-23s-spacer-16s), with restriction sites noted for enzymes cutting
5 times or less (continued 2).
[Sequence figure; annotated landmarks: 23s begins, 16s ends.]






DNA sequence and restriction map of the E. coli MG1655 rrnD operon
(5s-spacer-23s-spacer-16s), with restriction sites noted for enzymes cutting
5 times or less (continued 3).
[Sequence figure; annotated landmark: 16s begins.]
Restriction map of the rrnD operon (16s-spacer-23s-spacer-5s) of E. coli MG1655,
with restriction sites noted for enzymes cutting 5 times or less.
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. 1 cont'd 

Restriction map of the rrnD operon (16s-spacer-23s-spacer-5s) of E. coli MG1655,
with restriction sites noted for enzymes cutting 5 times or less (continued 1).
Restriction map of the rrnD operon (16s-spacer-23s-spacer-5s) of E. coli MG1655,
with restriction sites noted for enzymes cutting 5 times or less (continued 2).
DNA sequence and restriction map of the E. coli MG1655 rrnE operon
(16s-spacer-23s-spacer-5s), with restriction sites noted for enzymes cutting
5 times or less.
[Sequence figure; annotated landmark: 16s begins.]
DNA sequence and restriction map of the E. coli MG1655 rrnE operon
(16s-spacer-23s-spacer-5s), with restriction sites noted for enzymes cutting
5 times or less (continued 1).
[Sequence figure; annotated landmarks: 16s ends, 23s begins.]
DNA sequence and restriction map of the E. coli MG1655 rrnE operon
(16s-spacer-23s-spacer-5s), with restriction sites noted for enzymes cutting
5 times or less (continued 2).
DNA sequence and restriction map of the E. coli MG1655 rrnE operon
(16s-spacer-23s-spacer-5s), with restriction sites noted for enzymes cutting
5 times or less (continued 3).
[Sequence figure; annotated landmarks: 23s ends, 5s begins, 5s ends.]
Restriction map of the rrnE operon (16s-spacer-23s-spacer-5s) of E. coli MG1655,
with restriction sites noted for enzymes cutting 5 times or less.
Restriction map of the rrnE operon (16s-spacer-23s-spacer-5s) of E. coli MG1655,
with restriction sites noted for enzymes cutting 5 times or less (continued 1).
[Restriction map figure; position axis labeled at intervals of 200, up to 5000.]
Restriction map of the rrnE operon (16s-spacer-23s-spacer-5s) of E. coli MG1655,
with restriction sites noted for enzymes cutting 5 times or less (continued 2).
DNA sequence and restriction map of the E. coli MG1655 rrnG operon
(5s-spacer-23s-spacer-16s), with restriction sites noted for enzymes cutting
5 times or less.
[Sequence figure; annotated landmarks: 5s ends, 5s begins, 23s ends.]






"Fig.. 7 



DNA sequence and restriction map of the E. coli MG1655 rrnG operon
(5s-spacer-23s-spacer-16s), with restriction sites noted for enzymes cutting
5 times or less (continued 1).
DNA sequence and restriction map of the E. coli MG1655 rrnG operon
(5s-spacer-23s-spacer-16s), with restriction sites noted for enzymes cutting
5 times or less (continued 2).
[Sequence figure; annotated landmarks: 23s begins, 16s ends.]
DNA sequence and restriction map of the E. coli MG1655 rrnG operon
(5s-spacer-23s-spacer-16s), with restriction sites noted for enzymes cutting
5 times or less (continued 3).
[Sequence figure; annotated landmark: 16s begins.]
Restriction map of the rrnG operon (16s-spacer-23s-spacer-5s) of E. coli MG1655,
with restriction sites noted for enzymes cutting 5 times or less (continued 1).
[Restriction map figure; position axis labeled at intervals of 200, from 200 to 5000.]
Restriction map of the rrnG operon (16s-spacer-23s-spacer-5s) of E. coli MG1655,
with restriction sites noted for enzymes cutting 5 times or less.
Restriction map of the rrnG operon (16s-spacer-23s-spacer-5s) of E. coli MG1655,
with restriction sites noted for enzymes cutting 5 times or less (continued 2).
[Restriction map figure; position axis labeled at intervals of 200.]
DNA sequence and restriction map of the E. coli MG1655 rrnH operon
(16s-spacer-23s-spacer-5s), with restriction sites noted for enzymes cutting
5 times or less.
[Sequence figure; annotated landmark: 16s begins.]
DNA sequence and restriction map of the E. coli MG1655 rrnH operon
(16s-spacer-23s-spacer-5s), with restriction sites noted for enzymes cutting
5 times or less (continued 1).
[Sequence figure; annotated landmarks: 16s ends, 23s begins.]
DNA sequence and restriction map of the E. coli MG1655 rrnH operon
(16s-spacer-23s-spacer-5s), with restriction sites noted for enzymes cutting
5 times or less (continued 2).
DNA sequence and restriction map of the E. coli MG1655 rrnH operon
(16s-spacer-23s-spacer-5s), with restriction sites noted for enzymes cutting
5 times or less (continued 3).
[Sequence figure; annotated landmarks: 23s ends, 5s begins, 5s ends.]
Restriction map of the rrnH operon (16s-spacer-23s-spacer-5s) of E. coli MG1655,
with restriction sites noted for enzymes cutting 5 times or less.
Restriction map of the rrnH operon (16s-spacer-23s-spacer-5s) of E. coli MG1655,
with restriction sites noted for enzymes cutting 5 times or less (continued 1).
Restriction map of the rrnH operon (16s-spacer-23s-spacer-5s) of E. coli MG1655,
with restriction sites noted for enzymes cutting 5 times or less (continued 2).
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